Great to hear a few success stories! We have been experimentally
running ZFS on really crappy hardware and it has never lost a
pool. Running under VirtualBox with ZFS on iSCSI raw disks we have
yet to see any errors at all. On sun4u with LSI SAS/SATA it is
rock solid. And we've been going out of our way to break it,
because of bad experiences with NTFS, ext2 and UFS, as well as
many disk failures (ever had fsck run amok?).

On 07/31/09 12:11 PM, Richard Elling wrote:

Making flush be a nop destroys the ability to check for errors
thus breaking the trust between ZFS and the data on medium.
-- richard
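
(For anyone unfamiliar with what "making flush be a nop" means in
practice: on OpenSolaris the cache-flush requests ZFS issues can be
globally suppressed with the zfs_nocacheflush tunable, e.g. in
/etc/system:

    set zfs:zfs_nocacheflush = 1

which is exactly the sort of thing Richard is warning against; a
virtualization layer that silently drops the guest's flushes has
much the same effect.)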

Can you comment on the issue that the underlying disks were,
as far as we know, never powered down? My understanding is
that disks usually try to flush their caches as quickly as
possible to make room for more data. In this scenario things
were probably quiet after the guest crash, so whatever was in
the cache would likely have been flushed anyway, certainly by
the time the OP restarted VirtualBox and the guest.

Could you also comment on CR 6667683, which I believe is proposed
as a solution for recovery in this very rare case? I understand
that the ZILs are allocated out of the general pool. Is there
a ZIL for the ZILs, or does that make no sense?

As the one who started the whole ECC discussion, I don't think
anyone has ever claimed that lack of ECC caused this loss of
a pool, or that it could. AFAIK lack of ECC can't be a problem
at all on RAIDZ vdevs, only with single drives or plain mirrors.
I've suggested an RFE to double-buffer the writes in the mirrored
case, but disabling checksums pretty much fixes the problem if
you don't have ECC, so it isn't worth pursuing. You can disable
checksums per file system, so that is an elegant solution if you
don't have ECC memory but you do mirror. Running without a mirror
is, IMO, suicidal with any file system.
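
In case it isn't obvious how to do that: checksum is an ordinary
per-dataset property, so a minimal sketch looks like this (the
pool/dataset names are just placeholders, and note that ZFS still
checksums its own metadata regardless of this setting):

    # turn off data checksums for one file system only
    zfs set checksum=off tank/nonecc

    # confirm it; other datasets keep the default (on)
    zfs get checksum tank/nonecc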

Has anyone ever actually lost a pool on Sun hardware other than
by losing too many replicas or operator error? As you have so
eloquently pointed out, building a reliable storage system is
an engineering problem. There are a lot of folks out there who
are very happy with ZFS on decent hardware. On crappy hardware
you get what you pay for...

Cheers -- Frank (happy ZFS evangelist)