Re: [zfs-discuss] Scrub works in parallel?

2012-06-12 Thread Roch Bourbonnais

Scrubs are run at very low priority and yield very quickly in the presence of
other work.
So I really would not expect to see scrub create any impact on any other type of
storage activity.
Resilvering will more aggressively push forward on what it has to do, but
resilvering does not need to
read any of the data blocks on the non-resilvering vdevs.
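
To make that concrete, here is a rough sketch (the pool layout, vdev names and
TB figures are made-up illustrations, not output from a real system): a scrub
has to read the allocated data on every top-level vdev, while a resilver of one
disk only has to read the allocated data on that disk's own vdev.

    # Hypothetical pool: allocated data (TB) per top-level vdev (assumed numbers).
    vdevs = {"mirror-0": 3.0, "mirror-1": 2.5, "mirror-2": 4.0}

    # A scrub verifies every allocated block in the pool.
    scrub_reads_tb = sum(vdevs.values())

    # A resilver of a disk in mirror-1 only reads mirror-1's allocated data.
    resilver_reads_tb = vdevs["mirror-1"]

    print(f"scrub reads    ~{scrub_reads_tb:.1f} TB")     # ~9.5 TB
    print(f"resilver reads ~{resilver_reads_tb:.1f} TB")  # ~2.5 TB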

-r

On 11 June 2012, at 09:05, Jim Klimov wrote:

 2012-06-11 5:37, Edward Ned Harvey wrote:
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Kalle Anka
 
 Assume we have 100 disks in one zpool. Assume it takes 5 hours to scrub one
 disk. If I scrub the zpool, how long will it take?
 
 Will it scrub one disk at a time, so it will take 500 hours, i.e. in sequence,
 just serial? Or is it possible to run the scrub in parallel, so it takes 5h no
 matter how many disks?
 
 It will be approximately parallel, because it's actually scrubbing only the
 used blocks, and the order it scrubs in will be approximately the order they
 were written, which was intentionally parallel.
 
 What the other posters said, plus: 100 disks is quite a lot
 of contention on the bus(es), so even if it is all parallel,
 the bus and CPU bottlenecks would raise the scrubbing time
 somewhat above the single-disk scrub time.
 
 Roughly, if all else is ideal (i.e. no/few random seeks and
 a fast scrub at 100 MB/s per disk), the SATA3 interface at 6 Gbit/s
 (on the order of ~600 MB/s) will be maxed out at about
 6 disks. If your disks are colocated on one HBA receptacle
 (i.e. via a backplane), this may be an issue for many disks
 in an enclosure (a 4-lane link will sustain about 24 drives
 at such speed, and that's not the market's max speed).
 
 Further on, the PCI buses will become a bottleneck and the
 CPU processing power might become one too, and for a box
 with 100 disks this may be noticeable, depending on the other
 architectural choices, components and their specs.
 
 HTH,
 //Jim
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Scrub works in parallel?

2012-06-12 Thread Jim Klimov

2012-06-12 16:20, Roch Bourbonnais wrote:


Scrubs are run at very low priority and yield very quickly in the presence of
other work.
So I really would not expect to see scrub create any impact on any other type of
storage activity.
Resilvering will more aggressively push forward on what it has to do, but
resilvering does not need to
read any of the data blocks on the non-resilvering vdevs.


Thanks, I agree - and that's important to notice, at least
on the current versions of ZFS :)

What I meant to stress is that if a scrub of one disk takes
5 hours (however that measurement is made, such as by
making a 1-disk pool with the same data distribution), then
there are physical reasons why a 100-disk pool would
probably take somewhat more than 5 hours to scrub; or, at
least, there are bottlenecks that should be watched in order
to minimize such an increase in scrub time.
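
As a back-of-the-envelope sketch of that reasoning (all numbers below are
illustrative assumptions, not measurements), the pool-wide scrub can be no
faster than the narrowest shared path allows:

    def est_scrub_hours(n_disks, data_per_disk_gb, disk_rate_mb_s,
                        shared_path_mb_s):
        """Crude estimate: scrub time is bounded below both by a single disk
        reading its own data and by the whole pool's data squeezed through
        the shared bus/HBA/CPU path."""
        per_disk_h = data_per_disk_gb * 1024 / disk_rate_mb_s / 3600
        aggregate_mb_s = min(n_disks * disk_rate_mb_s, shared_path_mb_s)
        pool_h = n_disks * data_per_disk_gb * 1024 / aggregate_mb_s / 3600
        return max(per_disk_h, pool_h)

    # ~1.8 TB of data per disk at 100 MB/s is roughly a 5-hour single-disk
    # scrub; 100 such disks behind a path moving 2400 MB/s take ~21 hours.
    print(est_scrub_hours(1, 1800, 100, 2400))    # ~5.1
    print(est_scrub_hours(100, 1800, 100, 2400))  # ~21.3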

Also, yes, presence of pool activity would likely delay
the scrub completion time, perhaps even more noticeably.

Thanks,
//Jim Klimov
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Scrub works in parallel?

2012-06-12 Thread Roch Bourbonnais

The process should be scalable.
Scrub all of the data on one disk using one disk's worth of IOPS.
Scrub all of the data on N disks using N disks' worth of IOPS.

That will take ~ the same total time.
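
In the ideal case the arithmetic does cancel out. A minimal sketch, with
assumed illustrative numbers:

    per_disk_data_mb = 1800 * 1024   # data to scrub per disk (assumed)
    per_disk_rate_mb = 100           # scrub read rate per disk (assumed)

    # N disks hold N times the data but also supply N times the read
    # throughput, so the elapsed time is independent of N.
    for n in (1, 10, 100):
        hours = (n * per_disk_data_mb) / (n * per_disk_rate_mb) / 3600
        print(n, "disks:", round(hours, 1), "hours")  # ~5.1 h every time
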
-r

On 12 June 2012, at 08:28, Jim Klimov wrote:

 2012-06-12 16:20, Roch Bourbonnais wrote:
 
 Scrubs are run at very low priority and yield very quickly in the presence
 of other work.
 So I really would not expect to see scrub create any impact on any other type
 of storage activity.
 Resilvering will more aggressively push forward on what it has to do, but
 resilvering does not need to
 read any of the data blocks on the non-resilvering vdevs.
 
 Thanks, I agree - and that's important to notice, at least
 on the current versions of ZFS :)
 
 What I meant to stress is that if a scrub of one disk takes
 5 hours (however that measurement is made, such as by
 making a 1-disk pool with the same data distribution), then
 there are physical reasons why a 100-disk pool would
 probably take somewhat more than 5 hours to scrub; or, at
 least, there are bottlenecks that should be watched in order
 to minimize such an increase in scrub time.
 
 Also, yes, presence of pool activity would likely delay
 the scrub completion time, perhaps even more noticeably.
 
 Thanks,
 //Jim Klimov
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Scrub works in parallel?

2012-06-12 Thread Jim Klimov

2012-06-12 16:45, Roch Bourbonnais wrote:


The process should be scalable.
Scrub all of the data on one disk using one disk's worth of IOPS.
Scrub all of the data on N disks using N disks' worth of IOPS.

That will take ~ the same total time.


IF the uplink, processing power, or some other bottleneck does
not limit that (e.g. a single 4-lane SAS link to a daisy-chain
of 100 or 200 disks would likely impose a bandwidth bottleneck).
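
For example (a hedged sketch; the per-disk rate and link speed are assumptions
in line with the figures discussed elsewhere in this thread), 100 disks that
could each stream ~100 MB/s ask for ~10 GB/s in aggregate, while a 4-lane
6 Gbit/s SAS wide port moves only on the order of 2.4 GB/s:

    n_disks = 100
    per_disk_mb_s = 100          # assumed per-disk scrub rate
    link_mb_s = 4 * 600          # 4 lanes x ~600 MB/s per 6 Gbit/s lane

    demand_mb_s = n_disks * per_disk_mb_s
    print(f"aggregate demand ~{demand_mb_s} MB/s vs uplink ~{link_mb_s} MB/s")
    # Once the uplink is the limit, each disk is effectively scrubbed at only:
    print(f"~{link_mb_s / n_disks:.0f} MB/s per disk")   # ~24 MB/s, not 100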

I know that well-engineered servers spec'ed by a vendor/integrator
for the customer's tasks and environment, such as those from Sun,
are built to avoid such apparent bottlenecks. But people who
construct their own storage should know of (and try to avoid)
such possible problem-makers ;)

Thanks, Roch,
//Jim Klimov
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Scrub works in parallel?

2012-06-12 Thread Richard Elling
On Jun 11, 2012, at 6:05 AM, Jim Klimov wrote:

 2012-06-11 5:37, Edward Ned Harvey wrote:
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Kalle Anka
 
 Assume we have 100 disks in one zpool. Assume it takes 5 hours to scrub one
 disk. If I scrub the zpool, how long will it take?
 
 Will it scrub one disk at a time, so it will take 500 hours, i.e. in sequence,
 just serial? Or is it possible to run the scrub in parallel, so it takes 5h no
 matter how many disks?
 
 It will be approximately parallel, because it's actually scrubbing only the
 used blocks, and the order it scrubs in will be approximately the order they
 were written, which was intentionally parallel.
 
 What the other posters said, plus: 100 disks is quite a lot
 of contention on the bus(es), so even if it is all parallel,
 the bus and CPU bottlenecks would raise the scrubbing time
 somewhat above the single-disk scrub time.

In general, this is not true for HDDs or modern CPUs. Modern systems
are overprovisioned for bandwidth. In fact, bandwidth has been a poor
design point for storage for a long time. Dave Patterson has some
interesting observations on this, now eight years old:
http://www.ll.mit.edu/HPEC/agendas/proc04/invited/patterson_keynote.pdf

SSDs tend to be a different story, and there is some interesting work being
done in this area, both on the systems side as well as the SSD side. This is
where the fun work is progressing :-)
 -- richard

-- 

ZFS and performance consulting
http://www.RichardElling.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Scrub works in parallel?

2012-06-11 Thread Jim Klimov

2012-06-11 5:37, Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Kalle Anka

Assume we have 100 disks in one zpool. Assume it takes 5 hours to scrub one
disk. If I scrub the zpool, how long will it take?

Will it scrub one disk at a time, so it will take 500 hours, i.e. in sequence,
just serial? Or is it possible to run the scrub in parallel, so it takes 5h no
matter how many disks?


It will be approximately parallel, because it's actually scrubbing only the
used blocks, and the order it scrubs in will be approximately the order they
were written, which was intentionally parallel.


What the other posters said, plus: 100 disks is quite a lot
of contention on the bus(es), so even if it is all parallel,
the bus and CPU bottlenecks would raise the scrubbing time
somewhat above the single-disk scrub time.

Roughly, if all else is ideal (i.e. no/few random seeks and
a fast scrub at 100 MB/s per disk), the SATA3 interface at 6 Gbit/s
(on the order of ~600 MB/s) will be maxed out at about
6 disks. If your disks are colocated on one HBA receptacle
(i.e. via a backplane), this may be an issue for many disks
in an enclosure (a 4-lane link will sustain about 24 drives
at such speed, and that's not the market's max speed).
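
The per-link arithmetic above, restated as a quick sketch (the 100 MB/s
per-disk figure is the assumption used in this paragraph):

    per_disk_mb_s = 100        # assumed streaming-ish scrub rate per disk

    sata3_link_mb_s = 600      # one 6 Gbit/s link, after encoding overhead
    sas_4lane_mb_s = 4 * 600   # a 4-lane 6 Gbit/s wide port

    print(sata3_link_mb_s // per_disk_mb_s, "disks saturate one 6 Gbit/s link")  # 6
    print(sas_4lane_mb_s // per_disk_mb_s, "disks saturate a 4-lane wide port")  # 24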

Further on, the PCI buses will become a bottleneck and the
CPU processing power might become one too, and for a box
with 100 disks this may be noticeable, depending on the other
architectural choices, components and their specs.

HTH,
//Jim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Scrub works in parallel?

2012-06-10 Thread Tomas Forsman
On 10 June, 2012 - Kalle Anka sent me these 1,5K bytes:

 Assume we have 100 disks in one zpool. Assume it takes 5 hours to
 scrub one disk. If I scrub the zpool, how long will it take?
 
 
 Will it scrub one disk at a time, so it will take 500 hours, i.e. in
 sequence, just serial? Or is it possible to run the scrub in parallel,
 so it takes 5h no matter how many disks?

It walks the filesystem/pool trees, so it's not just reading the disk
from track 0 to track 12345, but validates all possible copies.
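
A much-simplified sketch of that idea (an illustrative model, not the actual
ZFS on-disk format or code): the scrub walks the tree of block pointers from
the root and verifies every copy of every allocated block it reaches, rather
than sweeping each disk LBA by LBA.

    from dataclasses import dataclass, field

    @dataclass
    class Block:
        copies: list           # (vdev, offset) locations: mirror/ditto copies
        checksum: int          # expected checksum stored in the parent pointer
        children: list = field(default_factory=list)   # child block pointers

    def scrub(block, read_copy, checksum_of):
        """Walk the allocated-block tree; verify every copy of each block."""
        bad = []
        for loc in block.copies:
            if checksum_of(read_copy(loc)) != block.checksum:
                bad.append(loc)                 # this copy failed verification
        for child in block.children:
            bad.extend(scrub(child, read_copy, checksum_of))
        return bad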

/Tomas
-- 
Tomas Forsman, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Scrub works in parallel?

2012-06-10 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Kalle Anka
 
 Assume we have 100 disks in one zpool. Assume it takes 5 hours to scrub one
 disk. If I scrub the zpool, how long will it take?
 
 Will it scrub one disk at a time, so it will take 500 hours, i.e. in sequence,
 just serial? Or is it possible to run the scrub in parallel, so it takes 5h no
 matter how many disks?

It will be approximately parallel, because it's actually scrubbing only the
used blocks, and the order it scrubs in will be approximately the order they
were written, which was intentionally parallel.

Aside from that, your question doesn't really make sense as asked, because you
don't just stick a bunch of disks in a pool.  You make a pool out of vdevs,
which are made of storage devices (in this case, disks).  The type and size of
each vdev (raidz, raidzN, mirror, etc.) will greatly affect performance, as
will your data usage patterns.

Scrubbing is an approximately random-IOPS task.  Mirrors parallelize random
I/O much better than raidz.

The amount of time it takes to scrub or resilver is dependent both on the
amount of used data on the vdev, and the on-disk ordering.
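
A rough sketch of why the vdev layout matters so much here (the IOPS figure
and pool shapes are assumptions for illustration): for small random reads,
each mirror disk can serve I/Os independently, while a raidz vdev behaves
roughly like a single disk, so the same 24 disks deliver very different
scrub IOPS.

    def random_read_iops(vdev_type, disks_per_vdev, n_vdevs, per_disk_iops=150):
        """Very rough model: mirrors read from every disk independently;
        a raidz vdev delivers roughly one disk's worth of random-read IOPS."""
        if vdev_type == "mirror":
            return n_vdevs * disks_per_vdev * per_disk_iops
        if vdev_type == "raidz":
            return n_vdevs * per_disk_iops
        raise ValueError(vdev_type)

    # The same 24 disks, arranged two ways:
    print(random_read_iops("mirror", 2, 12))   # 12 x 2-way mirrors: ~3600 IOPS
    print(random_read_iops("raidz", 8, 3))     # 3 x 8-disk raidz:   ~450 IOPS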

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss