[zfs-discuss] Pathological ZFS performance

2007-03-30 Thread Robert Thurlow

Hi folks,

In some prior posts, I've talked about trying to get four IDE drives in
a Firewire case working.  Yesterday, I bailed out due to hangs, filed
6539587, and moved the drives inside my Opteron box, hanging off one
of these:

http://www.newegg.com/Product/Product.asp?Item=N82E16816124001

Now, I know the single IDE controller is going to be a bottleneck,
but let me tell you how slow it is.  Last night, after moving the
drives, I started a scrub.  It's still running.  At 20 hours, I
was up to 57.75%, and had 14.5 hours left.  That's for a pool
that holds 341 GB, so the scrub has to access 341 * 4/3, or about 454 GB.
I make that something under 4 megabytes per second.  That's not
just slow, it's pathological.  I was easily getting 10-20X from
this pool in the external case.
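Working that out (rough arithmetic, assuming the scrub has to read the
full 4-drive raidz stripe):

  341 GB * 4/3              ~= 454 GB of raw data to scrub
  454 GB * 0.5775           ~= 262 GB read in the first 20 hours
  262 GB / (20 h * 3600 s)  ~=  3.6 MB/s across all four drives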

Where do I get started at diagnosing this?  I don't know much
about watching I/O throughput.

For yet-another-fallback, I am thinking about using SATA-to-IDE
converters here:

http://www.newegg.com/product/product.asp?item=N82E16812156010

It feels kind of nuts, but I have to think this would perform
better than what I have now.  This would cost me the one SATA
drive I'm using now in a smaller pool.

Rob T
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Pathological ZFS performance

2007-03-30 Thread Constantin Gonzalez
Hi,

If all four drives really hang off of one ATA controller, they may all
be stepping on each other's toes, since parallel ATA is a blocking bus
protocol: only one device per channel can transfer data at a time.
There's a speed discussion in the ATA article on Wikipedia:

  http://en.wikipedia.org/wiki/AT_Attachment
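A quick way to see which drives share a channel (a rough check, assuming
the usual Solaris PATA naming, where cXdY maps to IDE channel X,
master/slave Y):

  # ls -l /dev/dsk/c*d*s0 | grep ide

If two cXdY disks resolve to the same ide@N node in the device path, they
are master and slave on one cable and can only move data for one of them
at a time.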

My current machine (W1100z) has 4 internal PATA drives that run off of 2
controllers and I have a peculiar mixture of RAID-Z and mirroring over
different slices on those drives.  Performance is OK for everyday use, and
a zpool scrub of approx. 266 GB seems to take on the order of 5 hours:

# zfs list pelotillehue
NAME   USED  AVAIL  REFER  MOUNTPOINT
pelotillehue   266G  55.6G  28.5K  /pelotillehue
# zpool status pelotillehue
  pool: pelotillehue
 state: ONLINE
 scrub: scrub in progress, 3.69% done, 5h1m to go
config:

NAME            STATE     READ WRITE CKSUM
pelotillehue    ONLINE       0     0     0
  mirror        ONLINE       0     0     0
    c0d1s5      ONLINE       0     0     0
    c1d0s5      ONLINE       0     0     0
  raidz1        ONLINE       0     0     0
    c0d0s3      ONLINE       0     0     0
    c0d1s3      ONLINE       0     0     0
    c1d0s3      ONLINE       0     0     0
    c1d1s3      ONLINE       0     0     0
  raidz1        ONLINE       0     0     0
    c0d1s4      ONLINE       0     0     0
    c1d0s4      ONLINE       0     0     0
    c1d1s4      ONLINE       0     0     0

errors: No known data errors
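Back-of-the-envelope, that ~5-hour scrub of 266 GB works out to roughly
(the raw reads are somewhat higher because of parity and mirror copies):

  266 GB / (5 h * 3600 s)  ~=  15 MB/s of pool data touched by the scrub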


Since this is a home server that is constrained by 100 Mbit Ethernet
anyway, I don't care much about performance.  My iTunes library, MPEG-2
digital TV recording and playback, and PowerBook/iBook laptops all run
off of this pool, sometimes at the same time, with no problems.

Knowing about current problems with Firewire (see also other firewire
bugs in Artem's blog) and in search of a better solution, I decided to move
my pool to external USB storage.

USB 2.0 in practice gives you about the same performance as the network
(I'm being conservative here), so the speed is acceptable for me.
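In rough numbers (ballpark figures, not measurements):

  100 Mbit/s Ethernet:  100 / 8  ~= 12 MB/s, before protocol overhead
  USB 2.0, nominal:     480 / 8  ~= 60 MB/s
  USB 2.0, in practice:           ~ 25-35 MB/s for bulk disk I/O

So the network stays the bottleneck either way.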

So in the future, I'm going to migrate to pairs of cheap external USB 2.0
disks, which seem to work OK.  Throughput should stay on par with the
network, especially since mirroring, and striping across multiple mirrors,
will tend to increase performance.

The main advantage I see here is manageability: just plug and unplug disks
as needed, or whenever they need to be upgraded or replaced, without
tinkering inside your machine and while it is still running.
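A sketch of what I have in mind (hypothetical pool and device names; USB
disks usually show up as cXtYd0 on Solaris):

  # zpool create tank mirror c2t0d0 c3t0d0 mirror c4t0d0 c5t0d0
  # zpool replace tank c2t0d0 c6t0d0    # later: swap a disk out online

ZFS stripes across the two mirrors, so reads and writes spread over all
four spindles, and a disk can be replaced without taking the pool down.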

As for diagnosing your problem, I would start by monitoring per-device
throughput.  You can do that quite easily using the DTraceToolkit, or even
graphically with the Chime visual DTrace tool:

  http://www.opensolaris.org/os/community/dtrace/dtracetoolkit/
  http://www.opensolaris.org/os/project/dtrace-chime/
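For example, on the command line (a sketch; iotop is from the
DTraceToolkit, the other two are stock Solaris, and 'tank' stands for
your pool name):

  # ./iotop 10               # per-process, per-device bytes (shown below)
  # iostat -xnz 10           # per-device throughput, service times, %b
  # zpool iostat -v tank 10  # per-vdev bandwidth as ZFS sees it

If one disk shows much higher service times (asvc_t) or sits near 100 %b
while the others are idle, that is the drive or channel to look at.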

This is what iotop 10 says while my system is scrubbing the above pool:

2007 Mar 30 14:59:41,  load: 0.15,  disk_r: 369502 KB,  disk_w:  0 KB

  UID    PID   PPID CMD              DEVICE  MAJ MIN D      BYTES
    0      0      0 sched            cmdk3   102 196 R     131072
    0      0      0 sched            cmdk1   102  68 R     134144
    0      0      0 sched            cmdk2   102 132 R     134656
    0      0      0 sched            cmdk3   102 195 R   31796224
    0      0      0 sched            cmdk0   102   3 R   31860224
    0      0      0 sched            cmdk2   102 131 R   31921664
    0      0      0 sched            cmdk1   102  67 R   35162112
    0      0      0 sched            cmdk1   102  69 R  118307328
    0      0      0 sched            cmdk2   102 133 R  128397312

So I actually seem to be getting about 360 MB per 10 seconds, which is
roughly 36 MB/s across 2 UltraATA/100 controllers and 4 disks.

If you see something similar, this is probably expected.

Hope this helps,
   Constantin

Robert Thurlow wrote:
 Hi folks,
 
 In some prior posts, I've talked about trying to get four IDE drives in
 a Firewire case working.  Yesterday, I bailed out due to hangs, filed
 6539587, and moved the drives inside my Opteron box, hanging off one
 of these:
 
 http://www.newegg.com/Product/Product.asp?Item=N82E16816124001
 
 Now, I know the single IDE controller is going to be a bottleneck,
 but let me tell you how slow it is.  Last night, after moving the
 drives, I started a scrub.  It's still running.  At 20 hours, I
 was up to 57.75%, and had 14.5 hours left.  That's for a pool
 that holds 341 GB, so the scrub has to access 341 * 4/3, or about 454 GB.
 I make that something under 4 megabytes per second.  That's not
 just slow, it's pathological.  I was easily getting 10-20X from
 this pool in the external case.
 
 Where do I get started at diagnosing this?  I don't know much
 about watching I/O throughput.
 
 For yet-another-fallback, I am thinking about using SATA-to-IDE
 converters here:
 
 

Re: [zfs-discuss] Pathological ZFS performance

2007-03-30 Thread Robert Thurlow

Bill Sommerfeld wrote:

On Fri, 2007-03-30 at 06:12 -0600, Robert Thurlow wrote:

Last night, after moving the
drives, I started a scrub.  It's still running.  At 20 hours, I
was up to 57.75%, and had 14.5 hours left. 


Do you have any cron jobs which are creating periodic snapshots?


No, not in this case.  I remember that bug.  My scrub made linear
progress the whole way.

RT
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss