On 2014-03-07 20:34, Chris Siebenmann wrote:
> I have a reproducible kernel crash with OmniOS r151008j. The situation:
> The basic setup is a ZFS pool on mirrored pairs of iSCSI disks. The
> iSCSI disks come from two different iSCSI targets, and all
> targets are multipathed over two 10G networks. The pool is set to
> 'failmode=continue'. If I start a large streaming write to the pool and
> then take down both iSCSI interfaces on both targets (making all disks
> in the pool completely unavailable), OmniOS panics after a couple of
> minutes. Fortunately this doesn't happen if only a single target becomes
> inaccessible.

As a wild guess: since your streaming writes keep coming in, some
buffer space may become exhausted (perhaps by hanging ZIOs that are
waiting for the storage backends to come back). I would expect the
write() calls to block and thus throttle the clients from pushing more
data, but perhaps there are enough client threads trying to write that
their maximum buffer spaces combined overwhelm this particular server.
In short: when reproducing the bug, run something like "vmstat 1" in a
separate SSH session and watch whether your free memory plummets when
you disconnect the devices, and/or whether the "sr" column (the page
scanner's scan rate, a sign that the system is hunting for memory to
reclaim) increases substantially.
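
If eyeballing the output is awkward, the check could be scripted. The
following is only a rough sketch: it assumes vmstat prints a header row
containing the column names "free" and "sr", the function names and both
thresholds are made up for illustration, and the sample numbers are
invented, not taken from the crashing system.

```python
# Hypothetical helper: flag vmstat samples that suggest memory pressure.
# Assumes a header row naming the "free" and "sr" columns.

SAMPLE = """\
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 0 0 0 8000000 4000000 0 0 0 0 0 0 0 0 0 0 0 500 900 400 1 2 97
 0 0 0 7000000 3000000 0 0 0 0 0 0 0 0 0 0 0 500 900 400 1 2 97
 0 0 0 3000000 1000000 0 0 0 0 900 0 4000 0 0 0 0 500 900 400 1 9 90
"""  # invented numbers, for illustration only


def parse_vmstat(lines):
    """Yield (free, sr) pairs, locating both columns by header name."""
    free_idx = sr_idx = None
    for line in lines:
        fields = line.split()
        if "free" in fields and "sr" in fields:
            # Header row (vmstat repeats it periodically): remember positions.
            free_idx, sr_idx = fields.index("free"), fields.index("sr")
            continue
        if free_idx is None or len(fields) <= max(free_idx, sr_idx):
            continue  # group-title row, blank line, or truncated row
        try:
            yield int(fields[free_idx]), int(fields[sr_idx])
        except ValueError:
            continue  # not a numeric data row


def flag_pressure(samples, free_drop=0.5, sr_min=200):
    """Return indexes of samples where free memory fell below
    free_drop * (first sample's free) or where sr reached sr_min.
    Both thresholds are arbitrary assumptions; tune them per system."""
    samples = list(samples)
    if not samples:
        return []
    baseline = samples[0][0]
    return [i for i, (free, sr) in enumerate(samples)
            if free < baseline * free_drop or sr >= sr_min]


if __name__ == "__main__":
    for i in flag_pressure(parse_vmstat(SAMPLE.splitlines())):
        print("sample", i, "looks like memory pressure")
```

To use it on live data, one would feed the lines of "vmstat 1" output to
parse_vmstat() instead of SAMPLE. Taking the first sample as the baseline
is a simplification, since the first line vmstat prints is a summary
since boot rather than a one-second sample.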
HTH,
//Jim Klimov