Re: [zfs-discuss] Drive i/o anomaly

2011-02-10 Thread Roy Sigurd Karlsbakk
> matt@vault:~$ zpool status
> pool: rpool
> state: ONLINE
> scan: resilvered 588M in 0h3m with 0 errors on Fri Jan 7 07:38:06 2011
> config:
> 
> NAME          STATE     READ WRITE CKSUM
> rpool         ONLINE       0     0     0
>   mirror-0    ONLINE       0     0     0
>     c8t1d0s0  ONLINE       0     0     0
>     c8t0d0s0  ONLINE       0     0     0
> cache
>   c12d0s0     ONLINE       0     0     0
> 
> errors: No known data errors

> The stall occurs when the drive c8t1d0 is 100% waiting, and doing only
> slow i/o, typically writing about 2MB/s. However, the other drive is
> all zeros... doing nothing.
> 
> The drives are:
> c8t0d0 - Western Digital Green -
> SATA_WDC_WD15EARS-00Z_WD-WMAVU2582242
> c8t1d0 - Samsung Silencer -
> SATA_SAMSUNG_HD154UI___S1XWJDWZ309550
> 
> I've installed smartmon and run a short and a long test on both drives,
> neither of which found any errors.

Just a hunch, but try running `iostat -e` or `-E` to see the error statistics. 
http://karlsbakk.net/iostat-overview.sh will give you a nice overview if you're 
on a wide terminal. The times I've seen one drive slow down a pool, it has 
always been because of errors on that drive.
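
Something along these lines should do it (a rough sketch; -e prints a
per-device error summary, -E the full error details, and -n uses the cXtYdZ
device names):

  iostat -en    # soft/hard/transport error counters, one line per device
  iostat -En    # detailed error and identity info for every device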

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. 
It is an elementary imperative for all pedagogues to avoid excessive 
application of idioms of foreign origin. In most cases adequate and relevant 
synonyms exist in Norwegian.


Re: [zfs-discuss] Drive i/o anomaly

2011-02-09 Thread Richard Elling
On Feb 9, 2011, at 2:51 AM, Matt Connolly wrote:

> Thanks Richard - interesting...
> 
> The c8 controller is the motherboard SATA controller on an Intel D510 
> motherboard.
> 
> I've read over the man page for iostat again, and I don't see anything in 
> there that makes a distinction between the controller and the device.
> 
> If it is the controller, would it make sense that the problem affects only 
> one drive and not the other? It still smells of a drive issue to me. 

It might be a drive issue. It might be a controller issue. It might be a cable
issue. The reason for the slowness is not necessarily visible to the OS, beyond
the two queues shown in iostat.

> Since the controller is on the motherboard and difficult to replace, I'll 
> replace the drive shortly and see how it goes.

The hardware might be fine. At this point, given the data you've shared, it is
not possible to identify the root cause. We can only show where the slowdown is
and you can look more closely at the suspect components. Lately, I've spent a
lot of time with LSIutil and I am really impressed with the ability to identify
hardware issues on all data paths. Is there a similar utility for Intel
controllers?

> Nonetheless,  I still find it odd that the whole io system effectively hangs 
> up when one drive's queue fills up. Since the purpose of a mirror is to 
> continue operating in the case of one drive's failure, I find it frustrating 
> that the system slows right down so much because one drive's i/o queue is 
> full.

Slow != failed, for some definition of slow.
 -- richard



Re: [zfs-discuss] Drive i/o anomaly

2011-02-09 Thread David Dyer-Bennet

On Wed, February 9, 2011 04:51, Matt Connolly wrote:

> Nonetheless,  I still find it odd that the whole io system effectively
> hangs up when one drive's queue fills up. Since the purpose of a mirror is
> to continue operating in the case of one drive's failure, I find it
> frustrating that the system slows right down so much because one drive's
> i/o queue is full.

I see what you're saying.  But I don't think mirror systems really try to
handle asymmetric performance.  They either treat the drives equivalently,
or else they decide one of them is "broken" and don't use it at all.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] Drive i/o anomaly

2011-02-09 Thread Matt Connolly
Thanks Richard - interesting...

The c8 controller is the motherboard SATA controller on an Intel D510 
motherboard.

I've read over the man page for iostat again, and I don't see anything in there 
that makes a distinction between the controller and the device.

If it is the controller, would it make sense that the problem affects only one 
drive and not the other? It still smells of a drive issue to me. 

Since the controller is on the motherboard and difficult to replace, I'll 
replace the drive shortly and see how it goes.

Nonetheless, I still find it odd that the whole I/O system effectively hangs up 
when one drive's queue fills up. Since the purpose of a mirror is to continue 
operating in the case of one drive's failure, I find it frustrating that the 
system slows right down so much because one drive's I/O queue is full.
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Drive i/o anomaly

2011-02-08 Thread a . smith

It is a 4k sector drive, but I thought zfs recognised those drives and didn't
need any special configuration...?


4k drives are a big problem for ZFS; much has been posted/written about it.
Basically, if a 4k drive reports 512-byte blocks, as almost all of them do,
then ZFS does not detect the sector size and configure the pool correctly. If
the drive actually reports its real 4k block size, ZFS handles it very nicely.
So the problem/fault lies with drives misreporting their real block size to
maintain compatibility with other OSes etc., and not really with ZFS.


cheers Andy.





Re: [zfs-discuss] Drive i/o anomaly

2011-02-07 Thread Richard Elling
Observation below...

On Feb 4, 2011, at 7:10 PM, Matt Connolly wrote:

> Hi, I have a low-power server with three drives in it, like so:
> 
> 
> matt@vault:~$ zpool status
>  pool: rpool
> state: ONLINE
> scan: resilvered 588M in 0h3m with 0 errors on Fri Jan  7 07:38:06 2011
> config:
> 
> NAME          STATE     READ WRITE CKSUM
> rpool         ONLINE       0     0     0
>   mirror-0    ONLINE       0     0     0
>     c8t1d0s0  ONLINE       0     0     0
>     c8t0d0s0  ONLINE       0     0     0
> cache
>   c12d0s0     ONLINE       0     0     0
> 
> errors: No known data errors
> 
> 
> I'm running netatalk file sharing for mac, and using it as a time machine 
> backup server for my mac laptop.
> 
> When files are copying to the server, I often see periods of a minute or so 
> where network traffic stops. I'm convinced that there's some bottleneck in 
> the storage side of things because when this happens, I can still ping the 
> machine, and if I have an ssh window open, I can still see output from a 
> `top` command running smoothly. However, if I try and do anything that 
> touches disk (e.g. `ls`), that command stalls. When it comes good, 
> everything comes good, file copies across the network continue, etc.
> 
> If I have an ssh terminal session open and run `iostat -nv 5`, I see something 
> like this:
> 
> 
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     1.2   36.0  153.6  4608.0  1.2  0.3   31.9    9.3  16  18 c12d0
>     0.0  113.4    0.0  7446.7  0.8  0.1    7.0    0.5  15   5 c8t0d0
>     0.2  106.4    4.1  7427.8  4.0  0.1   37.8    1.4  93  14 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.4   73.2   25.7  9243.0  2.3  0.7   31.6    9.8  34  37 c12d0
>     0.0  226.6    0.0 24860.5  1.6  0.2    7.0    0.9  25  19 c8t0d0
>     0.2  127.6    3.4 12377.6  3.8  0.3   29.7    2.2  91  27 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.0   44.2    0.0  5657.6  1.4  0.4   31.7    9.0  19  20 c12d0
>     0.2   76.0    4.8  9420.8  1.1  0.1   14.2    1.7  12  13 c8t0d0
>     0.0   16.6    0.0  2058.4  9.0  1.0  542.1   60.2 100 100 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.0    0.2    0.0    25.6  0.0  0.0    0.3    2.3   0   0 c12d0
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
>     0.0   11.0    0.0  1365.6  9.0  1.0  818.1   90.9 100 100 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.2    0.0    0.1     0.0  0.0  0.0    0.1   25.4   0   1 c12d0
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
>     0.0   17.6    0.0  2182.4  9.0  1.0  511.3   56.8 100 100 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c12d0
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
>     0.0   16.6    0.0  2058.4  9.0  1.0  542.1   60.2 100 100 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c12d0
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
>     0.0   15.8    0.0  1959.2  9.0  1.0  569.6   63.3 100 100 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.2    0.0    0.1     0.0  0.0  0.0    0.1    0.1   0   0 c12d0
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
>     0.0   17.4    0.0  2157.6  9.0  1.0  517.2   57.4 100 100 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c12d0
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
>     0.0   18.2    0.0  2256.8  9.0  1.0  494.5   54.9 100 100 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c12d0
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
>     0.0   14.8    0.0  1835.2  9.0  1.0  608.1   67.5 100 100 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.2    0.0    0.1     0.0  0.0  0.0    0.1    0.1   0   0 c12d0
>     0.0    1.4    0.0     0.6  0.0 

Re: [zfs-discuss] Drive i/o anomaly

2011-02-07 Thread Marion Hakanson
matt.connolly...@gmail.com said:
> After putting the drive online (and letting the resilver complete) I took the
> slow drive (c8t1d0 western digital green) offline and the system ran very
> nicely.
> 
> It is a 4k sector drive, but I thought zfs recognised those drives and didn't
> need any special configuration...? 

That's a nice confirmation of the cost of not doing anything special (:-).

I hear the problem may be due to 4k drives which report themselves as 512-byte
drives for boot/BIOS compatibility reasons.  I've also seen various ways to
force 4k alignment, to check what the "ashift" value is for your pool's vdevs,
etc.  Googling "solaris zfs 4k sector align" will lead the way.
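
For what it's worth, one quick check of what alignment a pool actually got
(this assumes zdb -C dumps the cached pool config, which includes the ashift
per vdev; 9 means 512-byte alignment, 12 means 4k):

  zdb -C rpool | grep ashift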

Regards,

Marion




Re: [zfs-discuss] Drive i/o anomaly

2011-02-07 Thread Matt Connolly
Thanks, Marion.

(I actually got the drive labels mixed up in the original post... I edited it 
on the forum page: 
http://opensolaris.org/jive/thread.jspa?messageID=511057#511057 )

My suspicion was the same: the drive doing the slow i/o is the problem.

I managed to confirm that by taking the other drive (c8t0d0, the Samsung)
offline; the same stalls and slow i/o occurred.

After putting that drive back online (and letting the resilver complete), I
took the slow drive (c8t1d0, the Western Digital Green) offline, and the system
ran very nicely.
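
Roughly, the test was (device names as in the zpool status output above; this
is a sketch from memory, not a paste of the exact commands):

  zpool offline rpool c8t0d0s0   # run the workload against c8t1d0 alone
  zpool online rpool c8t0d0s0    # reattach; the resilver starts automatically
  zpool status rpool             # wait for "resilvered ... with 0 errors"
  zpool offline rpool c8t1d0s0   # then repeat with the suspect drive offline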

It is a 4k sector drive, but I thought zfs recognised those drives and didn't 
need any special configuration...?
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Drive i/o anomaly

2011-02-07 Thread Marion Hakanson
matt.connolly...@gmail.com said:
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     1.2   36.0  153.6  4608.0  1.2  0.3   31.9    9.3  16  18 c12d0
>     0.0  113.4    0.0  7446.7  0.8  0.1    7.0    0.5  15   5 c8t0d0
>     0.2  106.4    4.1  7427.8  4.0  0.1   37.8    1.4  93  14 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.4   73.2   25.7  9243.0  2.3  0.7   31.6    9.8  34  37 c12d0
>     0.0  226.6    0.0 24860.5  1.6  0.2    7.0    0.9  25  19 c8t0d0
>     0.2  127.6    3.4 12377.6  3.8  0.3   29.7    2.2  91  27 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.0   44.2    0.0  5657.6  1.4  0.4   31.7    9.0  19  20 c12d0
>     0.2   76.0    4.8  9420.8  1.1  0.1   14.2    1.7  12  13 c8t0d0
>     0.0   16.6    0.0  2058.4  9.0  1.0  542.1   60.2 100 100 c8t1d0
>                     extended device statistics
>     r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.0    0.2    0.0    25.6  0.0  0.0    0.3    2.3   0   0 c12d0
>     0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
>     0.0   11.0    0.0  1365.6  9.0  1.0  818.1   90.9 100 100 c8t1d0
> . . .

matt.connolly...@gmail.com said:
> I expect that the c8t0d0 WD Green is the lemon here and for some reason is
> getting stuck in periods where it can write no faster than about 2MB/s. Does
> this sound right? 

No, it's the opposite.  The drive sitting at 100%-busy, c8t1d0, while the
other drive is idle, is the sick one.  It's slower than the other and has 9.0
operations waiting (queued) to finish.  The other one is idle because it
has already finished the write activity and is waiting for the slow one
in the mirror to catch up.  If you run "iostat -xn" without the interval
argument, i.e. so it prints out only one set of stats, you'll see the
average performance of the drives since last reboot.  If the "asvc_t"
figure is significantly larger for one drive than the other, that's a
way to identify the one which has been slower over the long term.
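
In other words, something like this one-shot report (no interval argument, so
it prints averages since boot; compare the asvc_t column for c8t0d0 vs c8t1d0):

  iostat -xn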


> Secondly, what I wonder is why it is that the whole file system seems to hang
> up at this time. Surely if the other drive is doing nothing, a web page can
> be served by reading from the available drive (c8t1d0) while the slow drive
> (c8t0d0) is stuck writing slow. 

The available drive is c8t0d0 in this case.  However, if ZFS is in the
middle of a txg (ZFS transaction) commit, it cannot safely do much with
the pool until that commit finishes.  You can see that ZFS only lets 10
operations accumulate per drive (used to be 35), i.e. 9.0 in the "wait"
column, and 1.0 in the "actv" column, so it's kinda stuck until the
drive gets its work done.
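
(That per-drive limit is the zfs_vdev_max_pending kernel tunable, if memory
serves; you can inspect or lower it on a live system with mdb, e.g.:

  echo zfs_vdev_max_pending/D | mdb -k      # show the current value
  echo zfs_vdev_max_pending/W0t4 | mdb -kw  # drop it to 4 for slow SATA disks

though that only shortens the queue, it won't make a sick drive any faster.)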

Maybe the drive is failing, or maybe it's one of those with large sectors
that are not properly aligned with the on-disk partitions.

Regards,

Marion




[zfs-discuss] Drive i/o anomaly

2011-02-07 Thread Matt Connolly
Hi, I have a low-power server with three drives in it, like so:


matt@vault:~$ zpool status
  pool: rpool
 state: ONLINE
 scan: resilvered 588M in 0h3m with 0 errors on Fri Jan  7 07:38:06 2011
config:

NAME          STATE     READ WRITE CKSUM
rpool         ONLINE       0     0     0
  mirror-0    ONLINE       0     0     0
    c8t1d0s0  ONLINE       0     0     0
    c8t0d0s0  ONLINE       0     0     0
cache
  c12d0s0     ONLINE       0     0     0

errors: No known data errors


I'm running netatalk file sharing for mac, and using it as a time machine 
backup server for my mac laptop.

When files are copying to the server, I often see periods of a minute or so 
where network traffic stops. I'm convinced that there's some bottleneck on the 
storage side of things, because when this happens I can still ping the machine, 
and if I have an ssh window open, I can still see output from a `top` command 
running smoothly. However, if I try and do anything that touches disk (e.g. 
`ls`), that command stalls. When it comes good, everything comes good: file 
copies across the network continue, etc.

If I have an ssh terminal session open and run `iostat -nv 5`, I see something 
like this:


                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.2   36.0  153.6  4608.0  1.2  0.3   31.9    9.3  16  18 c12d0
    0.0  113.4    0.0  7446.7  0.8  0.1    7.0    0.5  15   5 c8t0d0
    0.2  106.4    4.1  7427.8  4.0  0.1   37.8    1.4  93  14 c8t1d0
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.4   73.2   25.7  9243.0  2.3  0.7   31.6    9.8  34  37 c12d0
    0.0  226.6    0.0 24860.5  1.6  0.2    7.0    0.9  25  19 c8t0d0
    0.2  127.6    3.4 12377.6  3.8  0.3   29.7    2.2  91  27 c8t1d0
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   44.2    0.0  5657.6  1.4  0.4   31.7    9.0  19  20 c12d0
    0.2   76.0    4.8  9420.8  1.1  0.1   14.2    1.7  12  13 c8t0d0
    0.0   16.6    0.0  2058.4  9.0  1.0  542.1   60.2 100 100 c8t1d0
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.2    0.0    25.6  0.0  0.0    0.3    2.3   0   0 c12d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
    0.0   11.0    0.0  1365.6  9.0  1.0  818.1   90.9 100 100 c8t1d0
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.2    0.0    0.1     0.0  0.0  0.0    0.1   25.4   0   1 c12d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
    0.0   17.6    0.0  2182.4  9.0  1.0  511.3   56.8 100 100 c8t1d0
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c12d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
    0.0   16.6    0.0  2058.4  9.0  1.0  542.1   60.2 100 100 c8t1d0
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c12d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
    0.0   15.8    0.0  1959.2  9.0  1.0  569.6   63.3 100 100 c8t1d0
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.2    0.0    0.1     0.0  0.0  0.0    0.1    0.1   0   0 c12d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
    0.0   17.4    0.0  2157.6  9.0  1.0  517.2   57.4 100 100 c8t1d0
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c12d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
    0.0   18.2    0.0  2256.8  9.0  1.0  494.5   54.9 100 100 c8t1d0
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c12d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
    0.0   14.8    0.0  1835.2  9.0  1.0  608.1   67.5 100 100 c8t1d0
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.2    0.0    0.1     0.0  0.0  0.0    0.1    0.1   0   0 c12d0
    0.0    1.4    0.0     0.6  0.0  0.0    0.0    0.2   0   0 c8t0d0
    0.0   49.0    0.0  6049.6  6.7  0.5  137.6   11.2 100  55 c8t1d0
                    extended device statistics
    r/s    w/s   kr/s