Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-05 Thread Chris Baker
Sanjeev, thanks for taking an interest. Unfortunately I did have failmode=continue, but I have just destroyed/recreated and double-checked, and got exactly the same results. zpool status shows: both drives mirror, ONLINE, no errors. dmesg shows: SATA device detached at port 0. cfgadm shows:

Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-05 Thread Ross
Just a thought, but how long have you left it? I had problems with a failing drive a while back which did eventually get taken offline, but took about 20 minutes to do so. -- This message posted from opensolaris.org ___ zfs-discuss mailing list

Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-05 Thread Chris Baker
I've left it hanging about 2 hours. I've also just learned that whatever the issue is, it is also blocking an init 5 shutdown. I was thinking about setting a watchdog with a forced reboot, but that will get me nowhere if I need a reset-button restart. Thanks for the advice re the LSI 1068, not
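
[Editor's sketch, not part of the original message.] The watchdog idea above needs some way to notice that a ZFS command has stopped returning. A minimal Python sketch of that check follows. The helper name `finishes_within` and both timing values are assumptions made for illustration; nothing in the thread specifies them. It only detects the hang and does not attempt the forced reboot, which, as noted above, may not succeed if the shutdown path is blocked too.

```python
import subprocess
import time


def finishes_within(cmd, timeout_s, poll_s=0.1):
    """Return True if cmd exits within timeout_s seconds, otherwise False.

    Hypothetical helper for illustration. It polls instead of using
    subprocess.run(timeout=...), because run() kills the child on timeout
    and then waits for it, and that wait could itself block if the child
    is stuck inside the kernel.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if proc.poll() is not None:  # child has exited
            return True
        time.sleep(poll_s)
    if proc.poll() is not None:  # exited during the last sleep
        return True
    proc.kill()  # best effort only; deliberately not followed by wait()
    return False
```

A watchdog loop could call `finishes_within(["zpool", "status"], timeout_s=30)` periodically and treat a `False` result as a hang; the 30-second limit is an arbitrary placeholder.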

Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-05 Thread Ross
Yeah, sounds just like the issues I've seen before. I don't think you're likely to see a fix anytime soon, but the good news is that so far I've not seen anybody reporting problems with LSI 1068 based cards (and I've been watching for a while). With the 1068 being used in the x4540 Thumper 2,

Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-05 Thread roland
Doesn't Solaris have the great built-in DTrace for issues like these? If we knew in which syscall or kernel thread the system is stuck, we might get a clue... Unfortunately, I don't have any real knowledge of Solaris kernel internals or DTrace...

Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-05 Thread Sanjeev
Chris,
On Wed, Aug 05, 2009 at 05:33:24AM -0700, Chris Baker wrote:
> Sanjeev Thanks for taking an interest. Unfortunately I did have failmode=continue, but I have just destroyed/recreated and double confirmed and got exactly the same results. zpool status shows both drives mirror,

Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-04 Thread Ross
What version of Solaris / OpenSolaris are you running there? I remember zfs commands locking up being a big problem a while ago, but I thought they'd managed to solve issues like this.

Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-04 Thread Chris Baker
Apologies - I'm daft for not saying originally: OpenSolaris 2009.06 on x86. Cheers, Chris

Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-04 Thread roland
what exact type of sata controller do you use?

Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-04 Thread Chris Baker
It's a generic Sil3132 based PCIe x1 card using the si3124 driver. Prior to this I had been using Intel ICH10R with AHCI but I have found the Sil3132 actually hot plugs a little smoother than the Intel chipset. I have not gone back to recheck this specific problem on the ICH10R (though I can),

Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-04 Thread Chris Baker
Ok - in an attempt to weasel my way past the issue I mirrored my problematic si3124 drive to a second drive on the ICH10R, started writing to the file system and then killed the power to the si3124 removable drive. To my (unfortunate) surprise, the IO stream that was writing to the mirrored

Re: [zfs-discuss] Recovering from ZFS command lock up after yanking a non-redundant drive?

2009-08-04 Thread Ross
Whether ZFS properly detects device removal depends to a large extent on the device drivers for the controller. I personally have stuck to using controllers with chipsets I know Sun use on their own servers, but even then I've been bitten by similar problems to yours on the AOC-SAT2-MV8 cards.