This should be mostly fixed in build 77.  ZFS will still lock up for the
duration of a single command timeout, but it should recover quickly
without queueing up additional commands.  Since the default timeout is
60 seconds, and we retry 3 times, and we do a probe afterwards, you may
see hangs of up to 6 minutes.  Unfortunately there's not much we can do,
since that's how long it takes for two I/O operations to a single drive
to time out (one that fails and one to do a basic probe of the disk).
You can lower 'sd_io_time' to a more reasonable value to get shorter
command timeouts, but this may break slow devices (such as powered-down
CD-ROM drives).
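
As a rough illustration of where the 6 minute figure comes from, and of
what lowering the timeout would buy, here is a small Python sketch.  It
assumes "retry 3 times" means three attempts in total per I/O, which is
the reading that matches the 6 minutes quoted above; the function name
and the 30 and 10 second values are made up for the example and are not
recommended settings.

```python
def worst_case_hang(timeout_s=60, attempts=3, io_operations=2):
    """Upper bound, in seconds, on the hang caused by one failing drive.

    Assumes every attempt of every I/O runs to the full command timeout.
    io_operations=2 covers the failed I/O plus the follow-up probe.
    """
    return timeout_s * attempts * io_operations


if __name__ == "__main__":
    # 60 is the default from the message; 30 and 10 are illustrative only.
    for timeout_s in (60, 30, 10):
        seconds = worst_case_hang(timeout_s)
        print(f"timeout {timeout_s:>2}s -> up to {seconds / 60:g} min")
```

The bound scales linearly with the timeout, so halving 'sd_io_time'
halves the worst-case hang, at the cost of less slack for slow devices.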

Other options at the ZFS level could be imagined, but would require
per-pool tunables:

1. Allowing I/O to complete as soon as it was on enough devices, instead
   of replicating to all devices.

2. Inventing a per-pool tunable that controlled timeouts independently
   of SCSI timeouts.

Neither of these is trivial, and both potentially compromise data
integrity, hence the lack of such features.  There's no easy solution to
the problem, but we're happy to hear ideas.
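
To make the trade-off in the first option concrete, here is a toy Python
model.  It is not ZFS code: the function, the latencies, and the
'required' parameter are all hypothetical, and it only compares when a
write would be acknowledged under "wait for every device" versus "wait
for enough devices".

```python
def completion_time(latencies, required):
    """Return the time at which `required` devices have finished a write.

    latencies: per-device completion times in seconds (made-up values).
    required:  how many devices must finish before the write is acknowledged.
    """
    if not 1 <= required <= len(latencies):
        raise ValueError("required must be between 1 and the number of devices")
    # The write is acknowledged when the `required`-th fastest device is done.
    return sorted(latencies)[required - 1]


# Two-way mirror: one healthy disk, one disk stuck for the 6 minute worst case.
mirror = [0.01, 360.0]
wait_for_all = completion_time(mirror, required=2)  # replicate to all devices
wait_for_one = completion_time(mirror, required=1)  # hypothetical option 1
```

In this model the relaxed policy returns as soon as the healthy disk is
done, but at that moment the data exists on only one side of the mirror,
which is the data integrity compromise mentioned above.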

- Eric

On Thu, Jan 03, 2008 at 02:57:08PM -0700, Jason J. W. Williams wrote:
> Hello,
> 
> There seems to be a persistent issue we have with ZFS: when one of the
> SATA disks in a zpool on a Thumper starts throwing sense errors, ZFS
> does not offline the disk and instead hangs all zpools across the
> system. If it is not caught soon enough, application data ends up in
> an inconsistent state. We've had this issue with b54 through b77 (as
> of last night).
> 
> Reading through the archives, we don't seem to be the only folks with
> this issue. Are there any plans to fix this behavior? It really makes
> ZFS less than desirable/reliable.
> 
> Best Regards,
> Jason
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

--
Eric Schrock, FishWorks                        http://blogs.sun.com/eschrock