Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-18 Thread Donald Stahl
Wow- so a bit of an update: With the default scrub delay:
echo zfs_scrub_delay/K | mdb -kw
zfs_scrub_delay:20004
pool0  14.1T  25.3T  165  499  1.28M  2.88M
pool0  14.1T  25.3T  146    0  1.13M      0
pool0  14.1T  25.3T  147    0  1.14M      0
pool0  14.1T
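(The iostat columns above are read ops, write ops, read bandwidth, write bandwidth. For readers following along, a minimal sketch of inspecting and clearing the tunable with mdb, assuming an OpenSolaris-era kernel that still has zfs_scrub_delay; the 0t prefix marks a decimal literal:)

  # print the current scrub delay as a 32-bit decimal
  echo zfs_scrub_delay/D | mdb -k

  # set it to 0 so no delay is inserted between scrub I/Os
  echo zfs_scrub_delay/W0t0 | mdb -kw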

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-18 Thread George Wilson
Don, Try setting the zfs_scrub_delay to 1 but increase the zfs_top_maxinflight to something like 64. Thanks, George On Wed, May 18, 2011 at 5:48 PM, Donald Stahl d...@blacksun.org wrote: Wow- so a bit of an update: With the default scrub delay: echo zfs_scrub_delay/K | mdb -kw
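Spelled out as commands, that suggestion would look something like this sketch (zfs_top_maxinflight caps the number of scrub I/Os queued per top-level vdev on builds of this vintage; 0t marks a decimal literal in mdb):

  echo zfs_scrub_delay/W0t1 | mdb -kw        # small delay between scrub I/Os
  echo zfs_top_maxinflight/W0t64 | mdb -kw   # allow up to 64 scrub I/Os in flight per top-level vdev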

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-18 Thread Donald Stahl
Try setting the zfs_scrub_delay to 1 but increase the zfs_top_maxinflight to something like 64. The array is running some regression tests right now but when it quiets down I'll try that change. -Don

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-18 Thread Donald Stahl
Try setting the zfs_scrub_delay to 1 but increase the zfs_top_maxinflight to something like 64. With the delay set to 1 or higher it doesn't matter what I set the maxinflight value to- when I check with:
echo ::walk spa | ::print spa_t spa_name spa_last_io spa_scrub_inflight
The value returned
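The full form of that check, as given elsewhere in the thread, pipes the dcmd pipeline into mdb in kernel mode:

  echo "::walk spa | ::print spa_t spa_name spa_last_io spa_scrub_inflight" | mdb -k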

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-17 Thread Jim Klimov
2011-05-17 6:32, Donald Stahl writes: I have two follow-up questions: 1. We changed the metaslab size from 10M to 4k- that's a pretty drastic change. Is there some median value that should be used instead and/or is there a downside to using such a small metaslab size? 2. I'm still confused by

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-17 Thread Richard Elling
On May 16, 2011, at 7:32 PM, Donald Stahl wrote: As a followup: I ran the same DD test as earlier- but this time I stopped the scrub:
pool0  14.1T  25.4T   88  4.81K   709K  262M
pool0  14.1T  25.4T  104  3.99K   836K  248M
pool0  14.1T  25.4T  360  5.01K  2.81M

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-17 Thread Donald Stahl
metaslab_min_alloc_size is not the metaslab size. From the source... Sorry- that was simply a slip of the mind- it was a long day. By reducing this value, it is easier for the allocator to identify a metaslab for allocation as the file system becomes full. Thank you for clarifying. Is there a

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-17 Thread George Wilson
On Mon, May 16, 2011 at 7:32 PM, Donald Stahl d...@blacksun.org wrote: As a followup: I ran the same DD test as earlier- but this time I stopped the scrub:
pool0  14.1T  25.4T   88  4.81K   709K  262M
pool0  14.1T  25.4T  104  3.99K   836K  248M
pool0  14.1T  25.4T

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-17 Thread George Wilson
On Tue, May 17, 2011 at 6:49 AM, Jim Klimov jimkli...@cos.ru wrote: 2011-05-17 6:32, Donald Stahl writes: I have two follow-up questions: 1. We changed the metaslab size from 10M to 4k- that's a pretty drastic change. Is there some median value that should be used instead and/or is there a

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-17 Thread Jim Klimov
So if you bump this to 32k then the fragmented size is 512k which tells ZFS to switch to a different metaslab once it drops below this threshold. Makes sense after some more reading today ;) What happens if no metaslab has a block this large (or small) on a sufficiently full and fragmented

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-17 Thread George Wilson
On Tue, May 17, 2011 at 11:48 AM, Jim Klimov j...@cos.ru wrote: So if you bump this to 32k then the fragmented size is 512k which tells ZFS to switch to a different metaslab once it drops below this threshold. Makes sense after some more reading today ;) What happens if no metaslab has a

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread George Wilson
Can you share your 'zpool status' output for both pools? Also you may want to run the following a few times in a loop and provide the output:
# echo "::walk spa | ::print spa_t spa_name spa_last_io spa_scrub_inflight" | mdb -k
Thanks, George On Sat, May 14, 2011 at 8:29 AM, Donald Stahl
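A simple way to run that a few times in a loop, per the request (a Bourne-shell sketch; the one-second pause is an arbitrary choice):

  for i in 1 2 3 4 5; do
      echo "::walk spa | ::print spa_t spa_name spa_last_io spa_scrub_inflight" | mdb -k
      sleep 1
  done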

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread Donald Stahl
Can you share your 'zpool status' output for both pools? Faster, smaller server:
~# zpool status pool0
  pool: pool0
 state: ONLINE
  scan: scrub repaired 0 in 2h18m with 0 errors on Sat May 14 13:28:58 2011
Much larger, more capable server:
~# zpool status pool0 | head
  pool: pool0
 state: ONLINE
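While a scrub is actually running, the scan: line of 'zpool status' reports the current rate and estimated completion time, so repeating the command (as with the | head trick above) is a cheap way to watch scrub throughput:

  ~# zpool status pool0 | head   # the scan: section shows rate and ETA mid-scrub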

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread George Wilson
Don, Can you send the entire 'zpool status' output? I wanted to see your pool configuration. Also run the mdb command in a loop (at least 5 times) so we can see if spa_last_io is changing. I'm surprised you're not finding the symbol for 'spa_scrub_inflight' too. Can you check that you didn't

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread Donald Stahl
Can you send the entire 'zpool status' output? I wanted to see your pool configuration. Also run the mdb command in a loop (at least 5 times) so we can see if spa_last_io is changing. I'm surprised you're not finding the symbol for 'spa_scrub_inflight' too. Can you check that you didn't

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread Donald Stahl
I copy and pasted to make sure that wasn't the issue :) Which, ironically, turned out to be the problem- there was an extra carriage return in there that mdb did not like. Here is the output:
spa_name = [ pool0 ]
spa_last_io = 0x82721a4
spa_scrub_inflight = 0x1
spa_name = [ pool0 ]
spa_last_io

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread Donald Stahl
Here is another example of the performance problems I am seeing:
~# dd if=/dev/zero of=/pool0/ds.test bs=1024k count=2000
2000+0 records in
2000+0 records out
2097152000 bytes (2.1 GB) copied, 56.2184 s, 37.3 MB/s
37MB/s seems like some sort of bad joke for all these disks. I can write the same
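One way to see whether the pool itself is busy during a run like this is to watch it from a second terminal while dd writes (a sketch; note that /dev/zero input compresses away to almost nothing if compression is enabled on the dataset, which would inflate the apparent rate):

  dd if=/dev/zero of=/pool0/ds.test bs=1024k count=2000 &
  zpool iostat pool0 1   # one-second samples while the write runs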

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread George Wilson
You mentioned that the pool was somewhat full, can you send the output of 'zpool iostat -v pool0'? You can also try doing the following to reduce 'metaslab_min_alloc_size' to 4K:
echo metaslab_min_alloc_size/Z 1000 | mdb -kw
NOTE: This will change the running system so you may want to make this
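If you try this, it is worth reading the value back to confirm the write took; 0x1000 is 4096 bytes, and mdb's /J format prints a 64-bit value:

  echo metaslab_min_alloc_size/Z 1000 | mdb -kw   # set to 0x1000 = 4K
  echo metaslab_min_alloc_size/J | mdb -k         # read back (printed in hex)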

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread Donald Stahl
You mentioned that the pool was somewhat full, can you send the output of 'zpool iostat -v pool0'?
~# zpool iostat -v pool0
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread Donald Stahl
You mentioned that the pool was somewhat full, can you send the output of 'zpool iostat -v pool0'? You can also try doing the following to reduce 'metaslab_min_alloc_size' to 4K:
echo metaslab_min_alloc_size/Z 1000 | mdb -kw
So just changing that setting moved my write rate from 40MB/s to

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread Roy Sigurd Karlsbakk
Running a zpool scrub on our production pool is showing a scrub rate of about 400K/s. (When this pool was first set up we saw rates in the MB/s range during a scrub). Usually, something like this is caused by a bad drive. Can you post iostat -en output? Vennlige hilsener / Best regards roy
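For reference, the s/w, h/w, and trn columns in that output count soft, hard, and transport errors per device; a drive whose counts keep climbing is the usual suspect. The device name below is hypothetical:

  iostat -en           # per-device error summary with descriptive names
  iostat -En c0t0d0    # full error and identity detail for one suspect drive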

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread Jim Klimov
2011-05-16 22:21, George Wilson writes: echo metaslab_min_alloc_size/Z 1000 | mdb -kw Thanks, this also boosted my home box from hundreds of KB/s into the several MB/s range, which is much better (I'm evacuating data from a pool hosted in a volume inside my main pool, and the bottleneck is quite

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-16 Thread Donald Stahl
As a followup: I ran the same DD test as earlier- but this time I stopped the scrub:
pool0  14.1T  25.4T   88  4.81K   709K  262M
pool0  14.1T  25.4T  104  3.99K   836K  248M
pool0  14.1T  25.4T  360  5.01K  2.81M  230M
pool0  14.1T  25.4T  305  5.69K  2.38M  231M

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-14 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Donald Stahl Running a zpool scrub on our production pool is showing a scrub rate of about 400K/s. (When this pool was first set up we saw rates in the MB/s range during a scrub). Wait

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-14 Thread Andrew Gabriel
On 05/14/11 01:08 PM, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Donald Stahl Running a zpool scrub on our production pool is showing a scrub rate of about 400K/s. (When this pool was first set up we saw rates

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-14 Thread Richard Elling
On May 13, 2011, at 11:25 AM, Donald Stahl d...@blacksun.org wrote: Running a zpool scrub on our production pool is showing a scrub rate of about 400K/s. (When this pool was first set up we saw rates in the MB/s range during a scrub). The scrub I/O has lower priority than other I/O. In

Re: [zfs-discuss] Extremely slow zpool scrub performance

2011-05-14 Thread Donald Stahl
The scrub I/O has lower priority than other I/O. In later ZFS releases, scrub I/O is also throttled. When the throttle kicks in, the scrub can drop to 5-10 IOPS. This shouldn't be much of an issue, scrubs do not need to be, and are not intended to be, run very often -- perhaps once a quarter