Hi,

This is in build 74, on x64, on a Tyan S2882-D with dual Opteron 275 and 24GB 
of ECC DRAM.

I seem to have lost the entire contents of a ZFS raidz pool.  The pool is in a 
state where, if ZFS looks at it, I get a kernel panic.  To make it possible to 
boot the machine, I had to boot into safe mode and rename 
`/etc/zfs/zpool.cache' (fortunately, this was my only pool on the machine).

Okay, from the beginning.  I bought the drives in October: three 500GB Western 
Digital WD5000ABYS SATA drives, installed them in the box in place of three 
250GB Seagates I had been using, and created the raidz pool.  For the first 
couple of months, everything was hunky-dory.  Then, a couple of weeks ago, I 
moved the machine to a different location in the building, which wouldn't even 
be worth mentioning except that that's when I started to have problems.  The 
first time I powered it up, one of the SATA drives didn't show up; I reseated 
the drive connectors and tried again, and it seemed fine.  I thought that was 
odd, since I hadn't had one of those connectors come loose on me before, but I 
scrubbed the pool, cleared the errors on the drive, and thought that was the 
end of it.
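
(For reference, the commands were roughly these; `tank' and the device name
stand in for my actual pool and disk:)

  # zpool scrub tank
  # zpool status -v tank          (checked after the scrub finished)
  # zpool clear tank c2t0d0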

That wasn't the end of it.  `zpool status' continued to report errors, only 
now they were read and write errors, spread across all three drives.  I 
started to copy the 
most critical parts of the filesystem contents onto other machines (very 
fortunately, as it turned out).  After a while, the drive that had previously 
not shown up was marked faulted, and the other two were marked degraded.  Then, 
yesterday, there was a much larger number of errors -- over 3000 read errors -- 
on a different drive, and that drive was marked faulted and the other two (i.e. 
including the one that had previously been faulted) were marked degraded.  
Also, `zpool status' told me I had lost some "files"; these turned out to be 
all, or mostly, directories, some containing substantial trees.

By this point I had already concluded I was going to have to replace a drive, 
and had picked up a replacement.  I installed it in place of the drive that was 
now marked faulted, and powered up.  I was met with repeated panics and 
reboots.  I managed to copy down part of the backtrace:

  unix:die+c8
  unix:trap+1351
  unix:cmntrap+e9
  unix:mutex_enter+b
  zfs:metaslab_free+97
  zfs:zio_dva_free+29
  zfs:zio_next_stage+b3
  zfs:zio_gang_pipeline+??

(This may contain typos, and I didn't get the offset on that last frame.)

At this point I tried putting back the drive I had just removed (and taking 
out the new, blank one), but that didn't help.  So, as mentioned above, I 
tried booting 
into safe mode and renaming `/etc/zfs/zpool.cache' -- just on a hunch, but I 
figured there had to be some such way to make ZFS forget about the pool -- and 
that allowed me to boot.
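
(For the record, what I did in safe mode was essentially this; the new name
is arbitrary, the point is just to get the file out of ZFS's way:)

  # mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bad
  # reboot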

I used good old `format' to run read tests on the drives overnight -- no bad 
blocks were detected.
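
(In case it matters how I tested: I went through format's analyze menu on
each disk, roughly like this, picking each SATA drive in turn:)

  # format
    ... select one of the SATA disks ...
  format> analyze
  analyze> read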

So, there are a couple of lines of discussion here.  On the one hand, it seems I 
have a hardware problem, but I haven't yet diagnosed it.  More on this below.  
On the other, even in the face of hardware problems, I have to report some 
disappointment with ZFS.  I had really been enjoying the warm fuzzy feeling ZFS 
gave me (and I was talking it up to my colleagues; I'm the only one here using 
it).  Now I'm in a worse state than I probably would be with UFS on RAID, 
where `fsck' would likely have managed to salvage a lot of the filesystem (and 
I would certainly still be able to mount it! -- unless the drives were all 
failing catastrophically, which doesn't seem to be happening).

One could say there are two aspects to filesystem robustness: integrity 
checking and recovery.  ZFS, with its block checksums, gets an A in integrity 
checking, but it now appears to do very poorly at recovery in the face of 
hardware degradation that is substantial but not total, yet severe enough that 
the pool's redundancy can't correct for it.

Perhaps this is a vanishingly rare case and I am just very unlucky.  
Nonetheless I would like to make some suggestions.  (1) It would still be nice 
to have a salvager.  (2) I think it would make sense, at least as an option, to 
add even more redundancy to ZFS's on-disk layout; for instance, it could keep 
copies of all directories.

Okay, back to my hardware problems.  I know you're going to tell me I probably 
have a bad power supply, and I can't rule that out, but it's an expensive PSU 
and generously sized for the box; and the box had been rock stable for a good 
18 months before this happened.  I'm naturally more inclined to suspect the new 
components, which are the SATA drives.  (I also have three SCSI drives in the 
box for /, swap, etc., and they don't seem to be having any trouble, though I'm 
not running ZFS on them so maybe I wouldn't know.)  It's definitely not DRAM; 
it's all ECC and `fmstat' is not reporting any errors.  On the other hand, it's 
implausible (though not totally so) that three new drives would all suffer 
infant mortality at the same time.  Suggestions invited (I haven't been able to 
get SunVTS to work, alas).
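
(For completeness, the memory check was just `fmstat'; I assume `fmdump -e'
would also list any ereports if the DIMMs were throwing correctable errors,
and I can run that too if it would help:)

  # fmstat
  # fmdump -e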

And, if anyone can tell me how to make this pool mountable again, by manually 
fiddling with the superblock or whatever, that would be great (though I'm not 
holding my breath).  I haven't overwritten the drive contents yet, so this 
might conceivably be possible.
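
(If it helps anyone who wants to advise me: I assume I can still dump the
on-disk labels with zdb without touching zpool.cache, something along the
lines of the following, with the device name just a placeholder:)

  # zdb -l /dev/rdsk/c2t0d0s0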

Thanks for your time.

-- Scott