This has been mentioned time and time again (and in much more detail) on the ZFS forums. The best post I've seen summing up ZFS at this point said, essentially, that ZFS is focused on data integrity, but data integrity does not automatically mean data availability.
Unfortunately that's a subtle distinction, and I feel ZFS is going to catch a few people out. I hit the pool hang / timeout issue with iSCSI, where a single offline device hung the entire pool. At the time I pointed out that if ZFS relies on the device drivers to spot failures, then a single device fault can hang your entire pool, and that fault can be anything that wasn't planned for, or simply a bug in the driver.

To my mind it's the classic blacklist / whitelist problem. ZFS should assume that drivers can fail, and do its best to keep the pool available regardless, instead of assuming the driver code is perfect. It already checks the integrity of the data (so it clearly doesn't trust the hardware or drivers completely). Surely adding a timeout check on I/O isn't impossible.
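To make the idea concrete, here's a minimal user-land sketch of that kind of watchdog: the potentially-blocking read is handed to a worker thread, and the caller waits on it with a deadline, so a hung device produces ETIMEDOUT (which the caller could use to fault the device and keep the pool going) instead of an indefinite hang. This is purely illustrative, and is not how ZFS or the Solaris I/O stack actually issues I/O; all the names (io_req, read_with_timeout, and so on) are made up for the example.

/*
 * timeout_read.c — illustrative sketch only, NOT ZFS code.
 * Build: cc -o timeout_read timeout_read.c -lpthread
 */
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

struct io_req {
    pthread_mutex_t lock;
    pthread_cond_t  done_cv;
    int             done;       /* worker finished the read          */
    int             abandoned;  /* caller timed out and moved on     */
    int             fd;
    void           *buf;        /* owned by the request, not caller  */
    size_t          len;
    ssize_t         result;
    int             err;        /* errno from the worker's read()    */
};

static void io_req_free(struct io_req *req)
{
    pthread_mutex_destroy(&req->lock);
    pthread_cond_destroy(&req->done_cv);
    free(req->buf);
    free(req);
}

static void *io_worker(void *arg)
{
    struct io_req *req = arg;

    /* This is the call that can block forever on a hung device
     * or a buggy driver. */
    ssize_t n = read(req->fd, req->buf, req->len);
    int err = errno;

    pthread_mutex_lock(&req->lock);
    req->result = n;
    req->err = err;
    req->done = 1;
    if (req->abandoned) {
        /* The caller already timed out; we own the cleanup. */
        pthread_mutex_unlock(&req->lock);
        io_req_free(req);
        return NULL;
    }
    pthread_cond_signal(&req->done_cv);
    pthread_mutex_unlock(&req->lock);
    return NULL;
}

/*
 * Read up to len bytes from fd into out, failing with ETIMEDOUT if
 * the device doesn't answer within timeout_sec.  On timeout the
 * caller can mark the device faulted and carry on, instead of
 * hanging along with it.
 */
ssize_t read_with_timeout(int fd, void *out, size_t len, int timeout_sec)
{
    struct io_req *req = calloc(1, sizeof (*req));
    if (req == NULL)
        return -1;
    req->buf = malloc(len);
    if (req->buf == NULL) {
        free(req);
        return -1;
    }
    pthread_mutex_init(&req->lock, NULL);
    pthread_cond_init(&req->done_cv, NULL);
    req->fd = fd;
    req->len = len;
    req->result = -1;

    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    /* Detached: on timeout, nobody will ever join this thread. */
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    pthread_t tid;
    if (pthread_create(&tid, &attr, io_worker, req) != 0) {
        io_req_free(req);
        return -1;
    }

    pthread_mutex_lock(&req->lock);
    int rc = 0;
    while (!req->done && rc != ETIMEDOUT)
        rc = pthread_cond_timedwait(&req->done_cv, &req->lock, &deadline);

    if (!req->done) {
        /* Deadline missed: hand cleanup to the worker, fail fast. */
        req->abandoned = 1;
        pthread_mutex_unlock(&req->lock);
        errno = ETIMEDOUT;
        return -1;
    }

    ssize_t n = req->result;
    if (n > 0)
        memcpy(out, req->buf, (size_t)n);
    else if (n < 0)
        errno = req->err;
    pthread_mutex_unlock(&req->lock);
    io_req_free(req);
    return n;
}

The important design point is the ownership hand-off: on timeout the caller abandons the request and the worker cleans it up whenever the driver finally returns. That kind of bookkeeping, rather than trusting the driver to return promptly, is exactly what I'm suggesting the pool code needs.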
