On 12-Feb-09, at 3:02 PM, Tim wrote:



On Thu, Feb 12, 2009 at 11:31 AM, David Dyer-Bennet <d...@dd-b.net> wrote:

On Thu, February 12, 2009 10:10, Ross wrote:

> Of course, that does assume that devices are being truthful when they
> say that data has been committed, but a little data loss from badly
> designed hardware is, I feel, acceptable, so long as ZFS can have a go
> at recovering corrupted pools when it does happen, instead of giving
> up completely like it does now.

Well, not "acceptable" as such, but I'd agree it's outside ZFS's purview. The blame for data lost because hardware actively lies and doesn't work to spec goes to the hardware vendor, not to ZFS.

If ZFS could easily and reliably warn about such hardware I'd want it to, but the consensus seems to be that we don't have a reliable qualification procedure. In terms of upselling people to a Sun storage solution, having ZFS diagnose problems with their cheap hardware early is clearly desirable :-).



Right, well, I can't imagine it's impossible to write a small app that tests whether or not drives are honoring commits correctly, by issuing a commit and immediately reading the data back to see whether it was indeed committed.

You do realise that this is not as easy as it looks? :) For one thing, the drive will simply serve the read from cache.
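To make that concrete, here is a minimal sketch of the naive check (an ordinary test file, arbitrary file name, nothing Solaris-specific assumed); the comments spell out why a successful read-back proves nothing about what actually reached the platters:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char wbuf[512], rbuf[512];
    memset(wbuf, 0xA5, sizeof wbuf);

    /* An ordinary test file; the name is purely illustrative. */
    int fd = open("flush-naive.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    if (pwrite(fd, wbuf, sizeof wbuf, 0) != (ssize_t)sizeof wbuf) { perror("pwrite"); return 1; }
    if (fsync(fd) < 0) { perror("fsync"); return 1; }   /* the "commit" request */

    if (pread(fd, rbuf, sizeof rbuf, 0) != (ssize_t)sizeof rbuf) { perror("pread"); return 1; }

    /* The read-back will almost always match, but that proves nothing:
     * the data can be served straight from the OS page cache or the
     * drive's volatile cache whether or not it ever hit stable storage. */
    printf("read-back %s\n", memcmp(wbuf, rbuf, sizeof rbuf) ? "MISMATCH" : "matches");
    close(fd);
    return 0;
}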

It's hard to imagine a test that doesn't involve literally pulling plugs; better still, a purpose-built hardware test harness.
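For what it's worth, a rough sketch of the writer half of such a pull-the-plug test (in the spirit of tools like diskchecker.pl) might look like this; the file name is illustrative, and the "committed" output would have to be captured on a second machine so it survives the power cut:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Illustrative file name; in a real run this would live on the
     * filesystem backed by the drive under test. */
    int fd = open("flushtest.dat", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) { perror("open"); return 1; }

    for (unsigned long seq = 0; ; seq++) {
        char rec[64];
        int len = snprintf(rec, sizeof rec, "%lu\n", seq);
        if (write(fd, rec, len) != len) { perror("write"); return 1; }
        if (fsync(fd) < 0) { perror("fsync"); return 1; }  /* drive claims the record is on stable storage */

        /* Announce the record as durable only after fsync() returns;
         * a second machine should log this output so it survives the
         * power cut. */
        printf("committed %lu\n", seq);
        fflush(stdout);
    }
    /* never reached: the run ends when the plug is pulled */
}

After the power cut, any record the program announced as committed but which is missing from the file points to a drive (or controller) that acknowledged a flush it never performed.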

Nonetheless I hope that someone comes up with a brilliant test. But if the ZFS team hasn't found one yet... it looks grim :)

--Toby

Like a "zfs test cXtX". Of course, then you can't just blame the hardware everytime something in zfs breaks ;)

--Tim

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
