> >>>>> "g" == Gino <dandr...@gmail.com> writes: > > g> we lost many zpools with multimillion$ EMC, > Netapp and > g> HDS arrays just simulating fc switches power > fails. > g> The problem is that ZFS can't properly > recover itself. > I don't like what you call ``the problem''---I think > it assumes too > much. You mistake *A* fix for *THE* problem, before > we can even agree > for sure on, what is the problem. The problem may be > in the solaris > FC initiator, in a corner case of the FC protocol > itself, or in ZFS's > exception handling when a ``SYNCHRONIZE CACHE'' > command returns > failure. > > It's likely other filesystems are affected by ``the > problem'' as I > define it, just much less so. If that's the case, > it'd be much better > IMHO to fix the real problem once and for all, and > find it so that it > stays fixed, than to make ZFS work around it by > losing a tiny bit of > data instead of the whole pool. I don't think ZFS > should feel > entitled to brag about protection from Silent > Corruption, if it were > at the same time willing to silently boot without a > slog, or silently > rollback to an earlier ueberblock, or if it acts like > a cheap USB > stick when an FC switch reboots (by quietly losing > things that were > written long ago).
I agree, but I'd like to point out that the MAIN problem with ZFS is
that a single corruption can cost you ALL your data, with no way to
recover it.

Consider an example where you have 100TB of data and an FC switch
fails, or some other hardware problem hits, during I/O on a single
file. With UFS you'll probably get corruption on that single file.
With ZFS you'll lose the entire pool.

I totally agree that ZFS is theoretically much, much better than UFS,
but in real-world deployments the risk of losing access to an entire
pool is not acceptable.

gino