>>>>> "gs" == Greg Shaw <[EMAIL PROTECTED]> writes:

    gs> Nevada isn't production code.  For real ZFS testing, you must
    gs> use a production release, currently Solaris 10 (update 5, soon
    gs> to be update 6).

Based on list feedback, my impression is that the results of a
``test'' confined to s10, particularly s10u4 (the latest release
available during most of Mike's experience), would be worse than the
Nevada experience over the same period.  But I doubt either matches
UFS+SVM or ext3+LVM2.  The on-disk format, with ``ditto blocks'' and
``always consistent'' semantics, may be fantastic, but the code for
reading it is not.

Maybe the code is stellar, and the problem really is underlying
storage stacks that fail to respect write barriers.  If so, ZFS needs
to include a storage stack qualification tool.  For me it doesn't
strain credibility to believe these problems might be rampant in VM
stacks and SANs, nor do I find it unacceptable if ZFS is vastly more
sensitive to them than any other filesystem.  If this speculation
turns out to be the case, I imagine the two go together: the problems
are rampant precisely because they don't bother other filesystems too
catastrophically.  But if that is the situation, then ZFS needs to
give the sysadmin a way to isolate and fix the problems
deterministically before filling the pool with data, not just blame
the sysadmin based on nebulous, speculative hindsight gremlins.

And if it's NOT the case, the ZFS problems need to be acknowledged and
fixed.
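
To make the ``qualification tool'' idea concrete, here is the sort of
crude probe I have in mind (a sketch of my own, nothing that ships
with ZFS; the file path and the threshold are numbers I invented):
time a burst of fsync()'d single-block writes.  A lone 7200rpm disk
that genuinely flushes its cache can't complete much more than one
synchronous write per revolution, so if the stack reports thousands of
fsync()s per second, some layer is almost certainly acknowledging
writes it hasn't made durable.

  #!/usr/bin/env python
  # Crude write-barrier probe: time fsync()'d single-block writes.
  # (A sketch only; the threshold and defaults are invented for illustration.)
  import os, sys, time

  path = sys.argv[1] if len(sys.argv) > 1 else "fsync-probe.dat"
  count = 500
  block = b"\0" * 512

  fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
  try:
      start = time.time()
      for _ in range(count):
          os.lseek(fd, 0, os.SEEK_SET)
          os.write(fd, block)
          os.fsync(fd)        # must not return before the block is on stable storage
      elapsed = time.time() - start
  finally:
      os.close(fd)
      os.unlink(path)

  rate = count / elapsed
  print("%d fsync()s in %.2fs => %.0f per second" % (count, elapsed, rate))
  if rate > 1000:             # far beyond what one honest spinning disk can do
      print("suspicious: some layer is probably acknowledging writes it has not flushed")

It obviously can't prove the stack is honest (only pulling the cord
can do that), but it catches the most common lie cheaply, before the
pool is full of data.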

In my view, the above is *IN ADDITION* to developing a
recovery/forensic/``fsck'' tool, not either/or.  The pools should not
be getting corrupted in the first place, and pulling the cord should
not mean you have to settle for best-effort recovery.  No other modern
filesystem demands an fsck after an unclean shutdown.

The current procedure for qualifying a platform seems to be: (1)
subject it to heavy write activity, (2) pull the cord, (3) repeat.
Ahmed, maybe you should use that test to ``quantify'' filesystem
reliability.  You can try it with ZFS, then reinstall the machine with
CentOS and try the same test with ext3+LVM2 or XFS+Areca.  The numbers
you get are how many times you can pull the cord before you lose
something, and how much you lose.  Here's a really old test of that
sort comparing Linux filesystems, which is something like what I have
in mind:

 https://www.redhat.com/archives/fedora-list/2004-July/msg00418.html

So you see he got two sets of numbers: number of reboots and amount
of corruption.  For reiserfs and JFS he lost their equivalent of ``the
whole pool''; for ext3 and XFS he got corruption but never lost the
pool.  It's not clear to me the filesystems ever claimed to prevent
corruption in his test scenario (was he calling fsync() after each log
write?  syslog sometimes does, and if so they do make that claim, but
if he was just writing with some silly script they don't), but they
all definitely claim you won't lose the whole pool in a power outage,
and only two out of four delivered on that.  I base my choice of Linux
filesystem on this test, and I wish I'd done such a test before
converting things to ZFS.
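
And for the pull-the-cord procedure itself, the harness doesn't need
to be fancy.  Below is a sketch of what I mean (mine, in Python; the
record size and layout are arbitrary choices): run it in ``write''
mode until you kill the power, watching the last acknowledged sequence
number on the console, then run it in ``verify'' mode after reboot.
The gap between what was acknowledged and what survived intact is your
``how much do you lose'' number, and whether the filesystem mounts at
all is your ``lost the whole pool'' number.

  #!/usr/bin/env python
  # Pull-the-cord harness (a sketch).  Write mode appends fixed-size records,
  # each carrying a sequence number and an md5 of its payload, fsync()s after
  # every one, and prints the last acknowledged sequence number.  Verify mode,
  # run after the power cut and reboot, reports the highest intact record and
  # counts torn or corrupt ones.
  import hashlib, os, struct, sys

  RECSIZE = 4096                      # arbitrary record size
  MAGIC = b"PWRT"

  def write_mode(path):
      fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
      seq = 0
      while True:
          payload = os.urandom(RECSIZE - 4 - 8 - 16)
          rec = MAGIC + struct.pack(">Q", seq) + hashlib.md5(payload).digest() + payload
          os.write(fd, rec)
          os.fsync(fd)                # record is claimed durable once this returns
          print("acked %d" % seq)     # watch this on the console when you pull the cord
          seq += 1

  def verify_mode(path):
      bad, highest = 0, -1
      with open(path, "rb") as f:
          while True:
              rec = f.read(RECSIZE)
              if len(rec) < RECSIZE:
                  break               # trailing partial record: a torn final write
              magic, seq = rec[:4], struct.unpack(">Q", rec[4:12])[0]
              digest, payload = rec[12:28], rec[28:]
              if magic != MAGIC or hashlib.md5(payload).digest() != digest:
                  bad += 1
              else:
                  highest = seq
      print("highest intact record: %d, corrupt records: %d" % (highest, bad))

  if __name__ == "__main__":
      mode, path = sys.argv[1], sys.argv[2]
      if mode == "write":
          write_mode(path)
      else:
          verify_mode(path)

Run that against ZFS, then against ext3+LVM2 or XFS on the same
hardware, and you get exactly the comparison I wish I'd made.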
