Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-02-14 Thread Remco van Bekkum
On Tue, Feb 12, 2008 at 01:03:35AM +0100, Torfinn Ingolfsen wrote: > On Mon, 11 Feb 2008 13:00:57 +0100 > [EMAIL PROTECTED] (Remco van Bekkum) wrote: > > > here? It's on an amd64, Asus m2a-vm with ati xp600, AMD BE-2350 CPU, > > 2GB 800MHz RAM. > > FWIW, I have almost the same motherboard (m2

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-02-11 Thread Torfinn Ingolfsen
On Mon, 11 Feb 2008 13:00:57 +0100 [EMAIL PROTECTED] (Remco van Bekkum) wrote: > here? It's on an amd64, Asus m2a-vm with ati xp600, AMD BE-2350 CPU, > 2GB 800MHz RAM. FWIW, I have almost the same motherboard (m2a-vm hdmi) with an AMD Phenom 9500 and 4GB RAM[1]. Different disk, though. The (

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-02-11 Thread Remco van Bekkum
On Mon, Feb 11, 2008 at 07:24:55AM -1000, Clifton Royston wrote: > On Mon, Feb 11, 2008 at 01:00:57PM +0100, Remco van Bekkum wrote: > > On Fri, Jan 25, 2008 at 04:38:46PM -0800, Jeremy Chadwick wrote: > > After having replaced my first SATA disk with one of the same type, > > having still the same

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-02-11 Thread Clifton Royston
On Mon, Feb 11, 2008 at 01:00:57PM +0100, Remco van Bekkum wrote: > On Fri, Jan 25, 2008 at 04:38:46PM -0800, Jeremy Chadwick wrote: > After having replaced my first SATA disk with one of the same type, > having still the same errors, I replaced this 1TB drive with 4x500GB > Hitachi P7K500 in raidz

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-02-11 Thread Remco van Bekkum
On Mon, Feb 11, 2008 at 01:00:57PM +0100, Remco van Bekkum wrote: > On Fri, Jan 25, 2008 at 04:38:46PM -0800, Jeremy Chadwick wrote: > > Joe, I wanted to send you a note about something that I'm still in the > > process of dealing with. The timing couldn't be more ironic. > > > > I decided it wou

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-02-11 Thread Remco van Bekkum
On Fri, Jan 25, 2008 at 04:38:46PM -0800, Jeremy Chadwick wrote: > Joe, I wanted to send you a note about something that I'm still in the > process of dealing with. The timing couldn't be more ironic. > > I decided it would be worthwhile to migrate from my two-disk ZFS stripe > with a non-ZFS dis

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-28 Thread J H
Richard Todd wrote: Workaround: always make sure you run /etc/rc.d/hostid start in single-user before doing any ZFS tinkering. Good advice -- thank you. But it still sounds like Jeremy's assessment, "it's a bug", is accurate. ZFS could certainly check for zero hostid. If zero, it shoul
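A minimal sketch of the zero-hostid check suggested in this message, assuming the hostid value is passed in as an argument (on FreeBSD it could come from `sysctl -n kern.hostid`). The function names `hostid_is_set` and `hostid_advice` are hypothetical, and this only illustrates the idea; it is not how ZFS itself behaves.

```sh
#!/bin/sh
# Hypothetical helper: succeed (status 0) when the hostid passed as $1
# looks set, fail (status 1) when it is empty or zero.
hostid_is_set() {
    case "$1" in
        ''|0|0x0|00000000|0x00000000) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical wrapper: print a diagnostic along the lines of the
# workaround quoted above.
hostid_advice() {
    if hostid_is_set "$1"; then
        echo "hostid $1 is set"
    else
        echo "hostid is zero: run /etc/rc.d/hostid start before touching ZFS"
    fi
}
```

In single-user mode this could be called as `hostid_advice "$(sysctl -n kern.hostid)"` before any zpool or zfs command.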

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-28 Thread J H
it should /definitely/ display a diagnostic which encourages the admin to use /etc/rc.d/hostid Ahhh, rather, display a diagnostic which encourages the use of "zpool import -a". --JH

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-27 Thread Danny Braniss
> Henri Hennebert wrote: > > Jeremy Chadwick wrote: > >> On Fri, Jan 25, 2008 at 06:17:24PM -0700, Joe Peterson wrote: > >>> Glad you got it back! Yes, when I was first playing with ZFS, I noticed > >>> that booting between single and multi user mode could make the pools > >>> "invisible". I

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-26 Thread Jeremy Chadwick
On Sat, Jan 26, 2008 at 01:15:31PM -0700, Joe Peterson wrote: > Joe Peterson wrote: > > So I have started a "SeaTools" (disk scanner from Seagate) "long test" of > > the > > drive. The short test passed already. The results should be interesting. > > If > > it finds nothing wrong, I am going t

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-26 Thread Joe Peterson
Ivan Voras wrote: > Were both tests done in the same machine (actually, I mean the same PSU)? Yes - I deliberately changed nothing (not even cables) before I ran the tests. I didn't want any variables. -Joe

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-26 Thread Richard Todd
Joe Peterson <[EMAIL PROTECTED]> writes: > Glad you got it back! Yes, when I was first playing with ZFS, I noticed > that booting between single and multi user mode could make the pools > "invisible". Import seemed to bring them back... Yeah. ZFS pools record the hostid of the system that acce

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-26 Thread Ivan Voras
Joe Peterson wrote: Joe Peterson wrote: So I have started a "SeaTools" (disk scanner from Seagate) "long test" of the drive. The short test passed already. The results should be interesting. If it finds nothing wrong, I am going to start to wonder if I am experiencing ZFS bugs that just happe

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-26 Thread Ivan Voras
Henri Hennebert wrote: Jeremy Chadwick wrote: On Fri, Jan 25, 2008 at 06:17:24PM -0700, Joe Peterson wrote: Glad you got it back! Yes, when I was first playing with ZFS, I noticed that booting between single and multi user mode could make the pools "invisible". Import seemed to bring them bac

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-26 Thread Joe Peterson
Joe Peterson wrote: > So I have started a "SeaTools" (disk scanner from Seagate) "long test" of the > drive. The short test passed already. The results should be interesting. If > it finds nothing wrong, I am going to start to wonder if I am experiencing ZFS > bugs that just happen to look like

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-26 Thread Joe Peterson
I performed a ZFS scrub, which finished yesterday, and no new /var/log/messages errors were reported during that time. However, the scrub found something interesting: crater# zpool status -v pool: tank state: ONLINE status: One or more devices has experienced an error resulting in data
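A rough sketch of how the pool state could be pulled out of `zpool status` output like the excerpt above, assuming the usual `pool:` and `state:` label lines. The helper name `pool_states` is hypothetical; it only reads text on stdin and does not run any ZFS command itself.

```sh
#!/bin/sh
# Hypothetical helper: read `zpool status` output on stdin and print one
# "<pool> <state>" pair per pool, e.g. "tank ONLINE".
pool_states() {
    awk '
        $1 == "pool:"  { pool = $2 }        # remember the current pool name
        $1 == "state:" { print pool, $2 }   # emit its state when seen
    '
}
```

A typical use would be `zpool status -v | pool_states` after a scrub (started with `zpool scrub tank`) has finished. The state alone does not show per-file errors, so the full `-v` output is still worth reading.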

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-26 Thread Henri Hennebert
Jeremy Chadwick wrote: On Fri, Jan 25, 2008 at 06:17:24PM -0700, Joe Peterson wrote: Glad you got it back! Yes, when I was first playing with ZFS, I noticed that booting between single and multi user mode could make the pools "invisible". Import seemed to bring them back... I did go into sin

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Ian Smith
On Fri, 25 Jan 2008, Jeremy Chadwick wrote: > On Fri, Jan 25, 2008 at 06:03:33PM -0700, Joe Peterson wrote: > > Wow, pretty crazy! Hmm, and yes, those LBAs do look close together. > > Well, let me know how the smartctl output looks. I'd be curious if your > > bad sector count rises. > > Ab

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Andrew MacIntyre
Jeremy Chadwick wrote: * Getting a larger power supply (usually when lots of disks are involved) I only have two drives, so I think the PS has enough capacity in my case. Agreed; even a 350W PSU should handle 2 disks without a problem. I've seen power supplies with a sagging 12V rail cause th

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Jeremy Chadwick
On Fri, Jan 25, 2008 at 06:17:24PM -0700, Joe Peterson wrote: > Glad you got it back! Yes, when I was first playing with ZFS, I noticed > that booting between single and multi user mode could make the pools > "invisible". Import seemed to bring them back... I did go into single-user mode and att

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Jeremy Chadwick
On Fri, Jan 25, 2008 at 06:03:33PM -0700, Joe Peterson wrote: > Wow, pretty crazy! Hmm, and yes, those LBAs do look close together. > Well, let me know how the smartctl output looks. I'd be curious if your > bad sector count rises. Absolutely nada on the SMART statistics. Nothing incremented or

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Joe Peterson
Glad you got it back! Yes, when I was first playing with ZFS, I noticed that booting between single and multi user mode could make the pools "invisible". Import seemed to bring them back... So, is the disk toast, or can you still read anything from it (part table, etc.)? -Joe Jeremy C

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Jeremy Chadwick
On Fri, Jan 25, 2008 at 05:00:54PM -0800, Jeremy Chadwick wrote: > icarus# zfs list > no datasets available > > This doesn't bode well, and doesn't make me happy. At all. Pshew! I was able to get ZFS to start seeing the pool again by doing the following: (Supposedly "zpool import" by itself wi

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Joe Peterson
Jeremy Chadwick wrote: > Joe, I wanted to send you a note about something that I'm still in the > process of dealing with. The timing couldn't be more ironic. > > I decided it would be worthwhile to migrate from my two-disk ZFS stripe > with a non-ZFS disk for nightly backups, to a RAIDZ pool

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Jeremy Chadwick
On Fri, Jan 25, 2008 at 04:38:46PM -0800, Jeremy Chadwick wrote: > I'll have to poke at SMART stats later to see what showed up. So the box did indeed panic. The backtrace contained about 1.5 screens of function calls from the stack, which makes taking a photo of the screen a bit worthless. All

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Jeremy Chadwick
Joe, I wanted to send you a note about something that I'm still in the process of dealing with. The timing couldn't be more ironic. I decided it would be worthwhile to migrate from my two-disk ZFS stripe with a non-ZFS disk for nightly backups, to a RAIDZ pool of all 3 disks combined (since th

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Jeremy Chadwick
On Fri, Jan 25, 2008 at 12:24:20PM -0700, Joe Peterson wrote: > In my case, I am using only one disk (ad0) for FreeBSD, and I am only > using one partition on this disk in my ZFS pool. So, in this case, > unfortunately, it's not possible to tell from the fact that only ad0 is > listed that it is s

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Chuck Swiger
On Jan 25, 2008, at 1:05 PM, Thomas Hurst wrote: These numbers are quite worrisome -- they should be zero or nearly so in a healthy drive. No, these are perfectly reasonable for a Seagate. I have about 12 7200.X's and all show the same sort of behavior. If they're nearly zero it's probabl
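One way to sidestep the argument over raw values is to compare each attribute's normalized VALUE against its THRESH, which is the comparison smartctl itself uses to mark an attribute as failed. The sketch below assumes the ten-column attribute table layout quoted in this thread (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE); the helper name `failing_attrs` is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: read a smartctl attribute table on stdin and print
# the names of attributes whose normalized VALUE is at or below THRESH.
# Rows are recognised by a numeric ID in column 1 and at least 10 columns.
failing_attrs() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && ($4 + 0) <= ($6 + 0) { print $2 }'
}
```

For example, `smartctl -A /dev/ad0 | failing_attrs` prints nothing for the Raw_Read_Error_Rate row quoted in the thread (VALUE 114, THRESH 006), despite its large raw value.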

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Thomas Hurst
* Chuck Swiger ([EMAIL PROTECTED]) wrote: > On Jan 25, 2008, at 11:24 AM, Joe Peterson wrote: >> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED >> WHEN_FAILED RAW_VALUE >> 1 Raw_Read_Error_Rate 0x000f 114 071 006 Pre-fail Always >> - 8242294

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Jeremy Chadwick
On Fri, Jan 25, 2008 at 12:46:08PM -0800, Chuck Swiger wrote: > On Jan 25, 2008, at 11:24 AM, Joe Peterson wrote: >> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED >> WHEN_FAILED RAW_VALUE >> 1 Raw_Read_Error_Rate 0x000f 114 071 006 Pre-fail Always >>

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Joe Peterson
Chuck Swiger wrote: > On Jan 25, 2008, at 11:24 AM, Joe Peterson wrote: >> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE >> UPDATED WHEN_FAILED RAW_VALUE >> 1 Raw_Read_Error_Rate 0x000f 114 071 006 Pre-fail >> Always - 82422948 > [ ... ] >> 7 See

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Peter Jeremy
On Fri, Jan 25, 2008 at 12:46:08PM -0800, Chuck Swiger wrote: > On Jan 25, 2008, at 11:24 AM, Joe Peterson wrote: >> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED >> WHEN_FAILED RAW_VALUE >> 1 Raw_Read_Error_Rate 0x000f 114 071 006 Pre-fail Always >>

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Chuck Swiger
On Jan 25, 2008, at 11:24 AM, Joe Peterson wrote: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f 114 071 006 Pre-fail Always - 82422948 [ ... ] 7 Seek_Error_Rate 0x000f 084

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Joe Peterson
Jeremy Chadwick wrote: > What you've shown is usually the sign of a disk-related problem. It's > very obvious when it's just one disk reporting DMA errors. You use ZFS, > so chances are you have more than one disk in a pool/volume -- there's > no indication ad1, ad4, ad6, etc. are failing, so thi

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Joe Peterson
Jeremy Chadwick wrote: > What you've shown is usually the sign of a disk-related problem. It's > very obvious when it's just one disk reporting DMA errors. You use ZFS, > so chances are you have more than one disk in a pool/volume -- there's > no indication ad1, ad4, ad6, etc. are failing, so thi

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Jeremy Chadwick
On Fri, Jan 25, 2008 at 06:42:04PM +0100, Julian H. Stacey wrote: > Jeremy Chadwick wrote: > > > wondering if this is a known issue. Note that smartctl does not report > > > errors logged and gives a "PASSED" to the drive. I am running at > > > UDMA100 ATA. Also, if it matters, I am using ZFS. >

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Julian H. Stacey
Jeremy Chadwick wrote: > > wondering if this is a known issue. Note that smartctl does not report > > errors logged and gives a "PASSED" to the drive. I am running at > > UDMA100 ATA. Also, if it matters, I am using ZFS. > Can you please provide output of the following: > > * smartctl -a /dev/

Re: "ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Jeremy Chadwick
On Fri, Jan 25, 2008 at 08:58:41AM -0700, Joe Peterson wrote: > I've seen mention of this kind of issue before, but I never saw a > solution, except that someone reported that a certain version of 6.x > seemed to make it go away - accounts of this problem are a bit vague. I > am running 7.0-RC1, a

"ad0: TIMEOUT - WRITE_DMA" type errors with 7.0-RC1

2008-01-25 Thread Joe Peterson
I've seen mention of this kind of issue before, but I never saw a solution, except that someone reported that a certain version of 6.x seemed to make it go away - accounts of this problem are a bit vague. I am running 7.0-RC1, and I am seeing the errors periodically, and I am wondering if this is