Re: [zfs-discuss] ZFS Mirrors braindead?

2008-10-07 Thread Ross Smith

Oh cool, that's great news.  Thanks Eric.



> Date: Tue, 7 Oct 2008 11:50:08 -0700
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: zfs-discuss@opensolaris.org
> Subject: Re: [zfs-discuss] ZFS Mirrors braindead?
> 
> On Tue, Oct 07, 2008 at 11:42:57AM -0700, Ross wrote:
>> 
>> Running "zpool status" is a complete no-no if your array is degraded
>> in any way.  This is capable of locking up ZFS even when it would
>> otherwise have recovered itself.  If you had zpool status hang, this
>> probably happened to you.
> 
> FYI, this is bug 6667208 fixed in build 100 of nevada.
> 
> - Eric
> 
> --
> Eric Schrock, Fishworks    http://blogs.sun.com/eschrock



Re: [zfs-discuss] ZFS Mirrors braindead?

2008-10-07 Thread Eric Schrock
On Tue, Oct 07, 2008 at 11:42:57AM -0700, Ross wrote:
> 
> Running "zpool status" is a complete no-no if your array is degraded
> in any way.  This is capable of locking up ZFS even when it would
> otherwise have recovered itself.  If you had zpool status hang, this
> probably happened to you.

FYI, this is bug 6667208 fixed in build 100 of nevada.
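
If you want to check whether a given box already has it, the Nevada build
number shows up in /etc/release; the output looks roughly like this (the
exact wording varies between builds):

  $ cat /etc/release
            Solaris Express Community Edition snv_100 X86
            ...

Anything reporting snv_100 or later should include the fix.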

- Eric

--
Eric Schrock, Fishworks    http://blogs.sun.com/eschrock


Re: [zfs-discuss] ZFS Mirrors braindead?

2008-10-07 Thread Carson Gaspar
kristof wrote:
> I don't know if this is already available in S10 10/08, but in opensolaris 
> build > 71 you can set the:
>
> zpool failmode property
>
> see:
> http://opensolaris.org/os/community/arc/caselog/2007/567/
>
> available options are:
>
>   The property can be set to one of three options: "wait", "continue",
> or "panic".

I'm fairly certain that this isn't what the OP was concerned about.

The OP appeared to be concerned about ZFS's behaviour when one half of a 
mirror went away. As the pool is merely degraded, ZFS will continue to 
allow reads and writes... eventually...

Depending on _how_ the disk is failing, I/O may become glacial, or 
freeze entirely for several minutes before recovering, or hiccup briefly 
and then go on normally. ZFS is layered to the point where stacked 
timeouts _may_ become unreasonably large (see many previous threads). 
And a single "slow" device will drag the rest of the volume with it 
(e.g. a disk that demands 10 retries per write).
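
To make that concrete: the per-command timeout of the common sd target
driver is one of those stacked layers, and every retry waits up to that
timeout again.  A rough illustration only (assumes the sd driver and its
usual 60-second default; check your own driver stack before changing
anything, and test before going near production):

  # Illustration: lower the sd driver's per-command timeout so a sick
  # disk gives up sooner.  Run as root; takes effect after a reboot.
  echo 'set sd:sd_io_time = 30' >> /etc/system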

SVM suffers from some of the same problems, although not (in my 
experience) to the same degree. SVM tends to err on the side of "fail 
the disk quickly", whereas ZFS tries very very hard to make all I/O 
succeed, and relies on the fault management system or I/O stack to 
decide to fail things.

-- 
Carson


Re: [zfs-discuss] ZFS Mirrors braindead?

2008-10-07 Thread Wade . Stuart

[EMAIL PROTECTED] wrote on 10/07/2008 01:10:51 PM:

> I don't know if this is already available in S10 10/08, but in
> opensolaris build > 71 you can set the:
>
> zpool failmode property
>
> see:
> http://opensolaris.org/os/community/arc/caselog/2007/567/
>
> available options are:
>
>  The property can be set to one of three options: "wait", "continue",
> or "panic".
>
> The default behavior will be to "wait" for manual intervention before
> allowing any further I/O attempts. Any I/O that was already queued would
> remain in memory until the condition is resolved. This error condition can
> be cleared by using the 'zpool clear' subcommand, which will attempt to
> resume any queued I/Os.
>
> The "continue" mode returns EIO to any new write request but attempts to
> satisfy reads. Any write I/Os that were already in-flight at the time
> of the failure will be queued and may be resumed using 'zpool clear'.
>
> Finally, the "panic" mode provides the existing behavior that was
> explained above.

Huh?  I was under the impression that this was for catastrophic write
failures (no paths to the storage at all), not just one side of a mirror
being down.  I run mostly raidz2 and have not tested mirror breakage, but am
I wrong in assuming that, like any other mirroring system (hardware or
software), when you lose one side of a mirror the expected result is that
the filesystem stays online and error free while the disk(s) in question are
marked as down/failed/offline?
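
What I'd expect from ZFS in that case is a pool that keeps serving I/O and
simply reports itself as degraded, roughly like this (illustrative output,
pool and device names made up):

  $ zpool status tank
    pool: tank
   state: DEGRADED
  status: One or more devices could not be opened.  Sufficient replicas
          exist for the pool to continue functioning in a degraded state.
  action: Attach the missing device and online it using 'zpool online'.
  config:

          NAME        STATE     READ WRITE CKSUM
          tank        DEGRADED     0     0     0
            mirror    DEGRADED     0     0     0
              c1t0d0  ONLINE       0     0     0
              c2t0d0  UNAVAIL      0     0     0  cannot open

  errors: No known data errors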

-Wade



Re: [zfs-discuss] ZFS Mirrors braindead?

2008-10-07 Thread Ross
As far as I can tell, it all comes down to whether ZFS detects the failure 
properly, and what commands you use as it's recovering.

Running "zpool status" is a complete no no if your array is degraded in any 
way.  This is capable of locking up zfs even when it would otherwise have 
recovered itself.  If you had zpool status hang, this probably happened to you.

It also appears that ZFS is at the mercy of your drivers when it comes to 
detecting and reacting to the failure.  In my experience this means that when 
a device does fail, ZFS may react instantly and keep your mirror online, it may 
take three minutes (waiting for iSCSI to time out), or it may take far longer 
(if FMA is involved).

I've seen ZFS mirrors protect data nicely, but I've also seen a lot of very odd 
fail modes.  I'd quite happily run ZFS in production, but you can be damn sure 
it'd be on Sun hardware, and I'd test as many fail modes as I could before it 
went live.
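
If you want to rehearse some of those failure modes cheaply before going
live, file-backed vdevs are good enough for a first pass (pool name and
paths below are made up, and it's no substitute for pulling real disks and
cables, since driver behaviour is exactly the part that varies):

  # Build a throwaway mirror from two file vdevs
  mkfile 256m /var/tmp/d0 /var/tmp/d1
  zpool create testpool mirror /var/tmp/d0 /var/tmp/d1

  # Take one side away; the pool should stay writable while DEGRADED
  zpool offline testpool /var/tmp/d1
  dd if=/dev/zero of=/testpool/junk bs=1024k count=10

  # Bring the device back, let it resilver, then clean up
  zpool online testpool /var/tmp/d1
  zpool status testpool
  zpool destroy testpool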


Re: [zfs-discuss] ZFS Mirrors braindead?

2008-10-07 Thread kristof
I don't know if this is already available in S10 10/08, but in opensolaris 
build > 71 you can set the:

zpool failmode property

see: 
http://opensolaris.org/os/community/arc/caselog/2007/567/

available options are:

 The property can be set to one of three options: "wait", "continue",
or "panic".

The default behavior will be to "wait" for manual intervention before
allowing any further I/O attempts. Any I/O that was already queued would
remain in memory until the condition is resolved. This error condition can
be cleared by using the 'zpool clear' subcommand, which will attempt to resume
any queued I/Os.

The "continue" mode returns EIO to any new write request but attempts to
satisfy reads. Any write I/Os that were already in-flight at the time
of the failure will be queued and may be resumed using 'zpool clear'.

Finally, the "panic" mode provides the existing behavior that was explained
above.
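
In practice, on a build that has the property, it looks something like this
(the pool name is just an example):

  # Keep serving reads and fail new writes with EIO if all paths go away
  zpool set failmode=continue tank
  zpool get failmode tank

  # Once the underlying problem is fixed, resume any queued I/O
  zpool clear tank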


[zfs-discuss] ZFS Mirrors braindead?

2008-10-07 Thread Matthew C Aycock
I recently ran into a problem for the second time with ZFS mirrors. I mirror 
between two different physical arrays for some of my data. One array (an SE3511) 
had a catastrophic failure and became unresponsive. With the ZFS in S10U3, the 
pool basically just waits for the array to come back and hangs pretty much all 
I/O to the zpool. I was told by Sun service that there are enhancements in the 
upcoming S10 10/08 release that will help.

My understanding of the code being delivered in S10 10/08 is that on 2-way 
mirrors (which is what I use), if this same situation occurs again, ZFS will 
allow reads but writes will still be queued until the other half of the mirror 
comes back.

Is it just me, or have we gone backwards? The whole point of mirroring is that 
if half the mirror goes, we survive and can fix the problem with little to NO 
impact on the running system. Is this really the behaviour? With ZFS root also 
being available in S10 10/08, I would not want ZFS anywhere near my root 
filesystem if it is.

Any information would be GREATLY appreciated!

BlueUmp