Michelle Sullivan
http://www.mhix.org/
Sent from my iPad

> On 30 Apr 2019, at 19:50, Xin LI <delp...@gmail.com> wrote:
> 
> 
>> On Tue, Apr 30, 2019 at 5:08 PM Michelle Sullivan <miche...@sorbs.net> wrote:
>> but in my recent experience two issues colliding at the same time result in
>> disaster
> 
> Do we know exactly what kind of corruption happened to your pool?  If you
> see it twice in a row, it might suggest a software bug that should be
> investigated.

All I know is that it’s a checksum error on a metaslab (122), and from what I
can gather it’s the spacemap that is corrupt... but I am no expert.  I don’t
believe it’s a software fault as such, because this was caused by a hard
outage (damaged UPSes) whilst resilvering a single (but completely failed)
drive.  ...and after the first outage a second occurred (same as the first,
but more damaging to the power hardware)... the host itself was not damaged,
nor were the drives or controller.
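
For anyone wanting to look at the same structures, this is roughly how I have
been inspecting them (a sketch only; zdb options vary a little between ZFS
versions, and "storage" is just my pool name):

    # Dump metaslab summaries for the exported pool without importing it
    zdb -e -m storage
    # Repeating -m also prints the spacemap entries themselves
    zdb -e -mm storage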

> 
> Note that ZFS stores multiple copies of its essential metadata, and in my
> experience with my old, crappy consumer grade hardware (non-ECC RAM, with
> several faulty single-drive pools: bad enough to crash almost monthly and
> damage my data from time to time),

This was a top end consumer grade motherboard with non-ECC RAM that had been
running for 8+ years without fault (except for hard drive platter failures).
Uptime would have been years if it wasn’t for patching.

> I've never seen corruption this bad, and I was always able to recover the
> pool.

So far, same.

> At a previous employer, the only case where we had a pool corrupted to the
> point that mounting was not allowed was when two host nodes happened to
> import the pool at the same time, a situation that can be avoided with SCSI
> reservations; their hardware was of much better quality, though.
> 
> Speaking of a tool like 'fsck': I think I'm mostly convinced that it's not
> necessary, because at the point ZFS says the metadata is corrupted, it means
> that the metadata really was corrupted beyond repair (all replicas were
> corrupted; otherwise it would recover by finding the right block and
> rewriting the bad ones).

I see this argument all the time and mostly agree... actually I do agree,
with possibly one minor exception, but it’s so minor it’s probably not worth
pursuing.  However, as I suggested in my original post: the pool says the
files are there, so a tool that would send them (à la zfs send) while
ignoring errors in the spacemaps etc. would be really useful (to me).
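
Something along these lines is what I have in mind (a sketch only: whether a
read-only import gets past the damaged spacemap depends on the ZFS version,
"storage" is my pool name, and the dataset/snapshot names here are made up
for illustration):

    # A read-only import never allocates, so the damaged spacemap may not
    # need to be loaded at all (version dependent)
    zpool import -o readonly=on -f storage
    # Given a pre-existing snapshot (a read-only pool cannot create new
    # ones), stream the data out to another machine
    zfs send storage/data@last | ssh backuphost zfs receive tank/rescue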

> 
> An interactive tool may be useful (e.g. "I saw data structure versions 1,
> 2, and 3 available, all with bad checksums; choose which one you want to
> try"), but I think it wouldn't be very practical for use with large data
> pools -- unlike traditional filesystems, ZFS uses copy-on-write and heavily
> depends on the metadata to find where the data is, so a regular "scan" is
> not really useful.

zdb -AAA showed (shows) 36M files, which suggests the data is intact, but
the mount aborts with an I/O error because it says the metadata has three
errors: two ‘metadata’ and one “<storage:0x0>” (storage being the pool
name).  The pool does import, and it attempts to resilver, but it reports
the resilver finishing at some 780M (ish); export, import, and it does it
all again...  zdb without -AAA aborts while loading metaslab 122.
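
For reference, these are roughly the invocations I mean (per the zdb manual,
-A ignores assertion failures, -AA demotes certain fatal errors to warnings,
and -AAA does both; -e operates on an exported pool):

    # Walks the dataset/object tree despite the bad metaslab
    zdb -e -AAA -d storage
    # The same walk without -AAA aborts while loading metaslab 122
    zdb -e -d storage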

> 
> I'd agree that you need a full backup anyway, regardless of what storage
> system is used, though.

Yeah... unlike UFS, which has to get really, really hosed before you’re
restoring from backup with nothing recoverable, it seems ZFS can get hosed
when issues occur in just the wrong bit... but mostly it is recoverable (and
my experience has been some nasty shit that always ended up being
recoverable).

Michelle 