Re: [zfs-discuss] Performance drop during scrub?

2010-05-03 Thread Tonmaus
 On Sun, 2 May 2010, Dave Pooser wrote:
 
  If my system is going to fail under the stress of a scrub, it's going
  to fail under the stress of a resilver. From my perspective, I'm not
  as scared
 
 I don't disagree with any of the opinions you stated except to point
 out that resilver will usually hit the (old) hardware less severely
 than scrub.  Resilver does not have to access any of the redundant
 copies of data or metadata, unless they are the only remaining good
 copy.
 
 Bob

Adding the perspective that scrub could consume my hard disks' life may sound 
like a really good argument for avoiding scrub on my system as far as possible, 
and thereby avoiding the performance issues in the first place. I just don't 
buy it. Sorry, it's too far-fetched. I'd still prefer that the original issue 
be fixed.

Regards,

Tonmaus


Re: [zfs-discuss] Performance drop during scrub?

2010-05-03 Thread David Dyer-Bennet

On Sun, May 2, 2010 14:12, Richard Elling wrote:
 On May 1, 2010, at 1:56 PM, Bob Friesenhahn wrote:
 On Fri, 30 Apr 2010, Freddie Cash wrote:
 Without a periodic scrub that touches every single bit of data in the
 pool, how can you be sure that 10-year files that haven't been opened
 in 5 years are still intact?
 
 You don't.  But it seems that having two or three extra copies of the
 data on different disks should instill considerable confidence.  With
 sufficient redundancy, chances are that the computer will explode before
 it loses data due to media corruption.  The calculated time before data
 loss becomes longer than even the pyramids in Egypt could withstand.
 
 These calculations are based on fixed MTBF.  But disk MTBF decreases with
 age. Most disks are only rated at 3-5 years of expected lifetime. Hence,
 archivists use solutions with longer lifetimes (high quality tape = 30
 years) and plans for migrating the data to newer media before the
 expected media lifetime is reached.  In short, if you don't expect to
 read your 5-year lifetime rated disk for another 5 years, then your
 solution is uhmm... shall we say... in need of improvement.

Are they giving tape that long an estimated life these days?  They
certainly weren't last time I looked.

And I basically don't trust tape; too many bad experiences (ever since I
moved off of DECTape, I've been having bad experiences with tape).  The
drives are terribly expensive and I can't afford redundancy, and in thirty
years I very probably could not buy a new drive for my old tapes.

I started out a big fan of tape, but the economics have been very much
against it in the range I'm working (small; 1.2 terabytes usable on my
server currently).

I don't expect I'll keep my hard disks for 30 years; I expect I'll upgrade
them periodically, probably even within their MTBF.  (Although note that,
though tests haven't been run, the MTBF of a 5-year disk after 4 years is
nearly certainly greater than 1 year.)

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] Performance drop during scrub?

2010-05-03 Thread Richard Elling
On May 3, 2010, at 2:38 PM, David Dyer-Bennet wrote:
 On Sun, May 2, 2010 14:12, Richard Elling wrote:
 On May 1, 2010, at 1:56 PM, Bob Friesenhahn wrote:
 On Fri, 30 Apr 2010, Freddie Cash wrote:
 Without a periodic scrub that touches every single bit of data in the
 pool, how can you be sure that 10-year files that haven't been opened
 in 5 years are still intact?
 
 You don't.  But it seems that having two or three extra copies of the
 data on different disks should instill considerable confidence.  With
 sufficient redundancy, chances are that the computer will explode before
 it loses data due to media corruption.  The calculated time before data
 loss becomes longer than even the pyramids in Egypt could withstand.
 
 These calculations are based on fixed MTBF.  But disk MTBF decreases with
 age. Most disks are only rated at 3-5 years of expected lifetime. Hence,
 archivists use solutions with longer lifetimes (high quality tape = 30
 years) and plans for migrating the data to newer media before the
 expected media lifetime is reached.  In short, if you don't expect to
 read your 5-year lifetime rated disk for another 5 years, then your
 solution is uhmm... shall we say... in need of improvement.
 
 Are they giving tape that long an estimated life these days?  They
 certainly weren't last time I looked.

Yes.
http://www.oracle.com/us/products/servers-storage/storage/tape-storage/036556.pdf
http://www.sunstarco.com/PDF%20Files/Quantum%20LTO3.pdf

 And I basically don't trust tape; too many bad experiences (ever since I
 moved off of DECTape, I've been having bad experiences with tape).  The
 drives are terribly expensive and I can't afford redundancy, and in thirty
 years I very probably could not buy a new drive for my old tapes.
 
 I started out a big fan of tape, but the economics have been very much
 against it in the range I'm working (small; 1.2 terabytes usable on my
 server currently).
 
 I don't expect I'll keep my hard disks for 30 years; I expect I'll upgrade
 them periodically, probably even within their MTBF.  (Although note that,
 though tests haven't been run, the MTBF of a 5-year disk after 4 years is
 nearly certainly greater than 1 year.)

Yes, but MTBF != expected lifetime.  MTBF is defined as Mean Time Between
Failures (a rate), not Time Until Death (a lifetime).  If your MTBF was 1 year,
then the probability of failing within 1 year would be approximately 63%,
assuming an exponential distribution.
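
For anyone who wants to check that number, the exponential model says the
probability of failing within time t is 1 - exp(-t/MTBF); at t = MTBF that is
(a generic back-of-the-envelope check, not specific to any drive):

  $ echo "1 - e(-1)" | bc -l     # ~0.63, i.e. about 63%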
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com








Re: [zfs-discuss] Performance drop during scrub?

2010-05-03 Thread Richard Elling
On Apr 30, 2010, at 11:44 AM, Freddie Cash wrote:
 Sure, you don't have to scrub every single week.  But you definitely want to 
 scrub more than once over the lifetime of the pool.

Yes.  There have been studies of this, and the results depend on both the
technical side (probabilities) and the comfort level (feeling lucky?). The
technical results will also depend on the quality and nature of the
algorithms involved, along with the quality of the hardware.  I think it is
safe to say that a scrub once per year is too infrequent and a weekly scrub
is too frequent for most folks.
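
(For what it's worth, a middle ground many people settle on is a monthly scrub
kicked off from cron -- the schedule and the pool name "tank" below are only
examples:)

  # crontab entry for root: scrub on the 1st of each month at 03:00
  0 3 1 * * /usr/sbin/zpool scrub tank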
  -- richard

ZFS storage and performance consulting at http://www.RichardElling.com








Re: [zfs-discuss] Performance drop during scrub?

2010-05-03 Thread Richard Elling
On Apr 29, 2010, at 11:55 AM, Katzke, Karl wrote:

 The server is a Fujitsu RX300 with a Quad Xeon 1.6GHz, 6G ram, 8x400G
 SATA through a U320SCSI-SATA box - Infortrend A08U-G1410, Sol10u8.
 
 slow disks == poor performance
 
 Should have enough oompf, but when you combine snapshot with a
 scrub/resilver, sync performance gets abysmal.. Should probably try
 adding a ZIL when u9 comes, so we can remove it again if performance
 goes crap.
 
 A separate log will not help.  Try faster disks.
 
 We're seeing the same thing in Sol10u8 with both 300gb 15k rpm SAS disks 
 in-board on a Sun x4250 and an external chassis with 1tb 7200 rpm SATA disks 
 connected via SAS. Faster disks aren't the problem; there's a fundamental 
 issue with ZFS [iscsi;nfs;cifs] share performance under scrub & resilver. 

In Solaris 10u8 (and prior releases) the default number of outstanding I/Os is
35 and (I trust, because Solaris 10 is not open source) the default max number
of scrub I/Os is 10 per vdev. If your disks are slow, the per-I/O service time
is long enough that the queue depth grows toward 35 entries, which gives the
I/O scheduler in ZFS an opportunity to prioritize and reorder the queue. If
your disks are fast, then you won't see this and life will be good.

In recent OpenSolaris builds, the default number of outstanding I/Os is reduced
to 4-10. For slow disks, the scheduler has a greater probability of being able
to prioritize non-scrub I/Os. Again, if your disks are fast, you won't see the
queue depth reach 10 and life will be good.

iostat is the preferred tool for measuring queue depth, though it would be easy
to write a dedicated tool using DTrace.
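
(For example, something along these lines -- the 10-second interval is
arbitrary -- will show the per-device queues while a scrub runs: actv is the
number of I/Os active on the device, wait is the number queued above it:)

  $ iostat -xn 10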

Also in OpenSolaris, there is code to throttle the scrub based on bandwidth. But
we've clearly ascertained that this is not a bandwidth problem, so a bandwidth
throttle is mostly useless... unless the disks are fast.

P.S. I don't consider any HDDs to be fast.  SSDs won.  Game over :-)
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com




Re: [zfs-discuss] Performance drop during scrub?

2010-05-03 Thread David Dyer-Bennet

On Mon, May 3, 2010 17:02, Richard Elling wrote:
 On May 3, 2010, at 2:38 PM, David Dyer-Bennet wrote:
 On Sun, May 2, 2010 14:12, Richard Elling wrote:
 On May 1, 2010, at 1:56 PM, Bob Friesenhahn wrote:
 On Fri, 30 Apr 2010, Freddie Cash wrote:
 Without a periodic scrub that touches every single bit of data in the
 pool, how can you be sure that 10-year files that haven't been opened
 in 5 years are still intact?

 You don't.  But it seems that having two or three extra copies of the
 data on different disks should instill considerable confidence.  With
 sufficient redundancy, chances are that the computer will explode
 before it loses data due to media corruption.  The calculated time
 before data loss becomes longer than even the pyramids in Egypt could
 withstand.

 These calculations are based on fixed MTBF.  But disk MTBF decreases
 with age. Most disks are only rated at 3-5 years of expected lifetime.
 Hence, archivists use solutions with longer lifetimes (high quality
 tape = 30 years) and plans for migrating the data to newer media before
 the expected media lifetime is reached.  In short, if you don't expect
 to read your 5-year lifetime rated disk for another 5 years, then your
 solution is uhmm... shall we say... in need of improvement.

 Are they giving tape that long an estimated life these days?  They
 certainly weren't last time I looked.

 Yes.
 http://www.oracle.com/us/products/servers-storage/storage/tape-storage/036556.pdf
 http://www.sunstarco.com/PDF%20Files/Quantum%20LTO3.pdf

Yep, they say 30 years.  That's probably in the same years where the MAM
gold archival DVDs are good for 200, I imagine.  (i.e. based on
accelerated testing, with the lab knowing what answer the client wants). 
Although we may know more about tape aging, the accelerated tests may be
more valid for tapes?

But LTO-3 is a 400GB tape that costs, hmmm, maybe $40 each (maybe less
with better shopping; that's a quick Amazon price rounded down).  (I don't
factor in compression in my own analysis because my data is overwhelmingly
image files and MP3 files, which don't compress further very well.)

Plus a $1000 drive, or $2000 for a 3-tape changer (and that's barely big
enough to back up my small server without manual intervention, and might
not be by the end of the year).

Tape is a LOT more expensive than my current hard-drive based backup
scheme, even if I use the backup drives only three years (and since they
spin less than 10% of the time, they should last pretty well).

Also, I lose my snapshots in a tape backup, whereas I keep them on my hard
drive backups.  (Or else I'm storing a ZFS send stream on tape and hoping
it will actually restore.)

 And I basically don't trust tape; too many bad experiences (ever since I
 moved off of DECTape, I've been having bad experiences with tape).  The
 drives are terribly expensive and I can't afford redundancy, and in
 thirty
 years I very probably could not buy a new drive for my old tapes.

 I started out a big fan of tape, but the economics have been very much
 against it in the range I'm working (small; 1.2 terabytes usable on my
 server currently).

 I don't expect I'll keep my hard disks for 30 years; I expect I'll
 upgrade
 them periodically, probably even within their MTBF.  (Although note
 that,
 though tests haven't been run, the MTBF of a 5-year disk after 4 years
 is
 nearly certainly greater than 1 year.)

 Yes, but MTBF != expected lifetime.  MTBF is defined as Mean Time Between
 Failures (a rate), not Time Until Death (a lifetime).  If your MTBF was 1
 year,
 then the probability of failing within 1 year would be approximately 63%,
 assuming an exponential distribution.

Yeah, sorry, I stumbled into using the same wrong figures lots of people
do.
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] Performance drop during scrub?

2010-05-02 Thread Tonmaus
Hi Bob,
 
 It is necessary to look at all the factors which
 might result in data 
 loss before deciding what the most effective steps
 are to minimize 
 the probability of loss.
 
 Bob

I am under the impression that exactly those considerations led both the ZFS 
designers to implement a scrub function in ZFS and the author of Best 
Practises to recommend performing this function frequently. I hear you are 
coming to a different conclusion, and I would be interested in learning what 
could possibly be so open to interpretation in this.

Regards,

Tonmaus


Re: [zfs-discuss] Performance drop during scrub?

2010-05-02 Thread Bob Friesenhahn

On Sun, 2 May 2010, Tonmaus wrote:


I am under the impression that exactly those considerations led both 
the ZFS designers to implement a scrub function in ZFS and the author 
of Best Practises to recommend performing this function frequently. I 
hear you are coming to a different conclusion, and I would be 
interested in learning what could possibly be so open to 
interpretation in this.


The value of periodic scrub is subject to opinion.  There are some 
highly respected folks on this list who put less faith in scrub 
because they believe more in MTTDL statistical models and less in the 
value of early detection (scrub == early detection).  With a single 
level of redundancy, early detection is more useful since there is 
just one opportunity to correct the error and correcting the error 
early decreases the chance of a later uncorrectable error.  Scrub will 
help repair the results of transient hardware misbehavior or partial 
media failures, but it will not keep a whole disk from failing.
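
(As a rough illustration of what those models look like: a common first-order
MTTDL approximation for an N-disk single-parity group is
MTBF^2 / (N * (N-1) * MTTR).  With made-up numbers -- 500,000-hour MTBF, 8
disks, and a 24-hour resilver -- that works out to:)

  $ echo "(500000^2) / (8*7*24) / 8760" | bc -l    # ~21000 years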


Once the computed MTTDL for the storage configuration is sufficiently 
high, other factors such as the reliability of ECC memory, kernel 
bugs, and hardware design flaws become dominant.  The human factor is 
often the most significant when it comes to data loss, since most 
data loss is still due to human error.  Most data loss problems we see 
reported here are due to human error or hardware design flaws.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Performance drop during scrub?

2010-05-02 Thread Richard Elling
On May 1, 2010, at 1:56 PM, Bob Friesenhahn wrote:
 On Fri, 30 Apr 2010, Freddie Cash wrote:
 Without a periodic scrub that touches every single bit of data in the pool, 
 how can you be sure that 10-year files that haven't been opened in 5 years 
 are still intact?
 
 You don't.  But it seems that having two or three extra copies of the data on 
 different disks should instill considerable confidence.  With sufficient 
 redundancy, chances are that the computer will explode before it loses data 
 due to media corruption.  The calculated time before data loss becomes longer 
 than even the pyramids in Egypt could withstand.

These calculations are based on fixed MTBF.  But disk MTBF decreases with 
age. Most disks are only rated at 3-5 years of expected lifetime. Hence, 
archivists use solutions with longer lifetimes (high quality tape = 30 years) 
and plans for migrating the data to newer media before the expected media 
lifetime is reached.  In short, if you don't expect to read your 5-year 
lifetime rated disk for another 5 years, then your solution is uhmm... shall 
we say... in need of improvement.

 
 The situation becomes similar to having a house with a heavy front door with 
 three deadbolt locks, and many glass windows.  The front door with its three 
 locks is no longer a concern when you are evaluating your home for its 
 security against burglary or home invasion because the glass windows are so 
 fragile and easily broken.
 
 It is necessary to look at all the factors which might result in data loss 
 before deciding what the most effective steps are to minimize the probability 
 of loss.

Yep... and manage the data over time.  There is a good reason why library 
scientists
will never worry about the future of their profession :-)
http://en.wikipedia.org/wiki/Library_science

 -- richard


ZFS storage and performance consulting at http://www.RichardElling.com






Re: [zfs-discuss] Performance drop during scrub?

2010-05-02 Thread Roy Sigurd Karlsbakk
- Roy Sigurd Karlsbakk r...@karlsbakk.net wrote:

 Hi all
 
 I have a test system with snv134 and 8x2TB drives in RAIDz2 and
 currently no Zil or L2ARC. I noticed the I/O speed to NFS shares on
 the testpool drops to something hardly usable while scrubbing the
 pool.
 
 How can I address this? Will adding Zil or L2ARC help? Is it possible
 to tune down scrub's priority somehow?

Further testing shows NFS speeds are acceptable after adding a ZIL and L2ARC (my 
test system has two SSDs for the root, so I detached one of them and split it 
into a 4GB slice for ZIL and the rest for L2ARC).
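
(Roughly what that looks like -- the slice names are examples only; the first
is the ~4GB log slice, the second the L2ARC slice:)

  # zpool add testpool log c1t1d0s0
  # zpool add testpool cache c1t1d0s1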

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. 
It is an elementary imperative for all pedagogues to avoid excessive use of 
idioms of foreign origin. In most cases, adequate and relevant synonyms exist 
in Norwegian.


Re: [zfs-discuss] Performance drop during scrub?

2010-05-02 Thread Bob Friesenhahn

On Sun, 2 May 2010, Richard Elling wrote:


These calculations are based on fixed MTBF.  But disk MTBF decreases with
age. Most disks are only rated at 3-5 years of expected lifetime. Hence, 
archivists use solutions with longer lifetimes (high quality tape = 30 years) 
and plans for migrating the data to newer media before the expected media 
lifetime is reached.  In short, if you don't expect to read your 5-year 
lifetime rated disk for another 5 years, then your solution is uhmm... shall 
we say... in need of improvement.


Yes, the hardware does not last forever.  It only needs to last while 
it is still being used and should only be used during its expected 
service life.  Your point is a good one.


On the flip-side, using 'zfs scrub' puts more stress on the system 
which may make it more likely to fail.  It increases load on the power 
supplies, CPUs, interfaces, and disks.  A system which might work fine 
under normal load may be stressed and misbehave under scrub.  Using 
scrub on a weak system could actually increase the chance of data 
loss.



ZFS storage and performance consulting at http://www.RichardElling.com


Please send $$$ to the above address in return for wisdom.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Performance drop during scrub?

2010-05-02 Thread Dave Pooser
On 5/2/10 3:12 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:

 On the flip-side, using 'zfs scrub' puts more stress on the system
 which may make it more likely to fail.  It increases load on the power
 supplies, CPUs, interfaces, and disks.  A system which might work fine
 under normal load may be stressed and misbehave under scrub.  Using
 scrub on a weak system could actually increase the chance of data
 loss.

If my system is going to fail under the stress of a scrub, it's going to
fail under the stress of a resilver. From my perspective, I'm not as scared
of data corruption as I am of data corruption *that I don't know about.* I
only keep backups for a finite amount of time. If I scrub every week, and my
zpool dies during a scrub, then I know it's time to pull out last week's
backup, where I know (thanks to scrubbing) the data was not corrupt. I've
lived the experience where a user comes to me because he tried to open a
seven-year-old file and it was corrupt. Not a blankety-blank thing I could
do, because we only retain backup tapes for four years and the four-year-old
tape had a backup of the file post-corruption.

Data loss may be unavoidable, but that's why we keep backups. It's the
invisible data loss that makes life suboptimal.
-- 
Dave Pooser, ACSA
Manager of Information Services
Alford Media  http://www.alfordmedia.com




Re: [zfs-discuss] Performance drop during scrub?

2010-05-02 Thread Bob Friesenhahn

On Sun, 2 May 2010, Dave Pooser wrote:


If my system is going to fail under the stress of a scrub, it's going to
fail under the stress of a resilver. From my perspective, I'm not as scared


I don't disagree with any of the opinions you stated except to point 
out that resilver will usually hit the (old) hardware less severely 
than scrub.  Resilver does not have to access any of the redundant 
copies of data or metadata, unless they are the only remaining good 
copy.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Performance drop during scrub?

2010-05-02 Thread Richard Elling
On May 2, 2010, at 12:05 PM, Roy Sigurd Karlsbakk wrote:
 - Roy Sigurd Karlsbakk r...@karlsbakk.net wrote:
 
 Hi all
 
 I have a test system with snv134 and 8x2TB drives in RAIDz2 and
 currently no Zil or L2ARC. I noticed the I/O speed to NFS shares on
 the testpool drops to something hardly usable while scrubbing the
 pool.
 
 How can I address this? Will adding Zil or L2ARC help? Is it possible
 to tune down scrub's priority somehow?
 
 Further testing shows NFS speeds are acceptable after adding a ZIL and L2ARC 
 (my test system has two SSDs for the root, so I detached one of them and split 
 it into a 4GB slice for ZIL and the rest for L2ARC).

Ok, this makes sense.  If you are using a pool configuration which is not
so good for high IOPS workloads (raidz*) and you give it a latency-sensitive,
synchronous IOPS workload (NFS) along with another high IOPS workload 
(scrub), then the latency-sensitive workload will notice.  Adding the SSD as a
separate log is a good idea.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com






Re: [zfs-discuss] Performance drop during scrub?

2010-05-01 Thread Bob Friesenhahn

On Fri, 30 Apr 2010, Freddie Cash wrote:


Without a periodic scrub that touches every single bit of data in the pool, how 
can you be sure
that 10-year files that haven't been opened in 5 years are still intact?


You don't.  But it seems that having two or three extra copies of the 
data on different disks should instill considerable confidence.  With 
sufficient redundancy, chances are that the computer will explode 
before it loses data due to media corruption.  The calculated time 
before data loss becomes longer than even the pyramids in Egypt could 
withstand.


The situation becomes similar to having a house with a heavy front 
door with three deadbolt locks, and many glass windows.  The front 
door with its three locks is no longer a concern when you are 
evaluating your home for its security against burglary or home 
invasion because the glass windows are so fragile and easily broken.


It is necessary to look at all the factors which might result in data 
loss before deciding what the most effective steps are to minimize 
the probability of loss.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread Tonmaus
 In my opinion periodic scrubs are most useful for pools based on 
 mirrors, or raidz1, and much less useful for pools based on raidz2 or 
 raidz3.  It is useful to run a scrub at least once on a well-populated 
 new pool in order to validate the hardware and OS, but otherwise, the 
 scrub is most useful for discovering bit-rot in singly-redundant 
 pools.
 
 Bob

Hi,

For one, well-populated pools are rarely new. Second, Best Practises 
recommendations on scrubbing intervals are based on disk product line 
(Enterprise monthly vs. Consumer weekly), not on redundancy level or pool 
configuration. Obviously, the issue under discussion affects all imaginable 
configurations, though it may vary in degree.
Recommending not to use scrub doesn't even qualify as a workaround, in my 
regard.

Regards,

Tonmaus


Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread David Dyer-Bennet

On Thu, April 29, 2010 17:35, Bob Friesenhahn wrote:

 In my opinion periodic scrubs are most useful for pools based on
 mirrors, or raidz1, and much less useful for pools based on raidz2 or
 raidz3.  It is useful to run a scrub at least once on a well-populated
 new pool in order to validate the hardware and OS, but otherwise, the
 scrub is most useful for discovering bit-rot in singly-redundant
 pools.

I've got 10 years of photos on my disk now, and it's growing faster
than one year per year (since I'm scanning backwards slowly through the
negatives).  Many of them don't get accessed very often; they're archival,
not current use.  Scrub was one of the primary reasons I chose ZFS for the
fileserver they live on -- I want some assurance, 20 years from now, that
they're still valid.  I needed something to check them periodically, and
something to check *against*, and block checksums and scrub seemed to fill
the bill.

So, yes, I want to catch bit rot -- on a pool of mirrored VDEVs.
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread Bob Friesenhahn

On Thu, 29 Apr 2010, Tonmaus wrote:

Recommending not to use scrub doesn't even qualify as a 
workaround, in my regard.


As a devoted believer in the power of scrub, I believe that after the 
OS, power supplies, and controller have been verified to function with 
a good scrubbing, if there is more than one level of redundancy, 
scrubs are not really warranted.  With just one level of redundancy it 
becomes much more important to verify that both copies were written to 
disk correctly.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread Freddie Cash
On Fri, Apr 30, 2010 at 11:35 AM, Bob Friesenhahn 
bfrie...@simple.dallas.tx.us wrote:

 On Thu, 29 Apr 2010, Tonmaus wrote:

  Recommending not to use scrub doesn't even qualify as a workaround, in
 my regard.


 As a devoted believer in the power of scrub, I believe that after the OS,
 power supplies, and controller have been verified to function with a good
 scrubbing, if there is more than one level of redundancy, scrubs are not
 really warranted.  With just one level of redundancy it becomes much more
 important to verify that both copies were written to disk correctly.

 Without a periodic scrub that touches every single bit of data in the pool,
how can you be sure that 10-year files that haven't been opened in 5 years
are still intact?

Self-healing only comes into play when the file is read.  If you don't read
a file for years, how can you be sure that all copies of that file haven't
succumbed to bit-rot?
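
(Which is what a scrub plus a status check gives you -- "tank" here is just a
placeholder for your pool name; "zpool status -x" either reports "all pools
are healthy" or lists the pools with problems:)

  # zpool scrub tank
  # zpool status -x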

Do you really want that "oh shit" moment 5 years from now, when you go to
open "Super Important Doc Saved for Legal Reasons" and find that all copies
are corrupt?

Sure, you don't have to scrub every single week.  But you definitely want to
scrub more than once over the lifetime of the pool.

-- 
Freddie Cash
fjwc...@gmail.com


Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread Roy Sigurd Karlsbakk
 On Thu, 29 Apr 2010, Tonmaus wrote:
 
  Recommending not to use scrub doesn't even qualify as a 
  workaround, in my regard.
 
 As a devoted believer in the power of scrub, I believe that after the
 OS, power supplies, and controller have been verified to function with
 a good scrubbing, if there is more than one level of redundancy, 
 scrubs are not really warranted.  With just one level of redundancy it
 becomes much more important to verify that both copies were written to
 disk correctly.

The scrub should still be usable without slowing down the system to 
something barely usable - that's why it's there. Adding new layers of security 
is nice, but dropping scrub because of OS bugs is rather ugly.

roy


Re: [zfs-discuss] Performance drop during scrub?

2010-04-30 Thread David Dyer-Bennet

On Fri, April 30, 2010 13:44, Freddie Cash wrote:
 On Fri, Apr 30, 2010 at 11:35 AM, Bob Friesenhahn 
 bfrie...@simple.dallas.tx.us wrote:

 On Thu, 29 Apr 2010, Tonmaus wrote:

  Recommending not to use scrub doesn't even qualify as a workaround,
 in my regard.


 As a devoted believer in the power of scrub, I believe that after the
 OS, power supplies, and controller have been verified to function with
 a good scrubbing, if there is more than one level of redundancy, scrubs
 are not really warranted.  With just one level of redundancy it becomes
 much more important to verify that both copies were written to disk
 correctly.

 Without a periodic scrub that touches every single bit of data in the
 pool, how can you be sure that 10-year files that haven't been opened
 in 5 years are still intact?

 Self-healing only comes into play when the file is read.  If you don't
 read a file for years, how can you be sure that all copies of that file
 haven't succumbed to bit-rot?

Yes, that's precisely my point.  That's why it's especially relevant to
archival data -- it's important (to me), but not frequently accessed.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] Performance drop during scrub?

2010-04-29 Thread Bruno Sousa
Indeed the scrub seems to take too many resources from a live system.
For instance I have a server with 24 disks (SATA 1TB) serving as NFS
store to a Linux machine holding user mailboxes. I have around 200
users, with maybe 30-40% of active users at the same time.
As soon as the scrub process kicks in, the Linux box starts to give
messages like "nfs server not available" and the users start to complain
that Outlook gives "connection timeout". Again, as soon as the scrub
process stops, everything comes back to normal.
So for me it's a real issue that the scrub takes so many resources of
the system, making it pretty much unusable. In my case I did a
*workaround*, where basically I have zfs send/receive from this server
to another server and the scrub process is now running on the second
server.
I don't know if this is such a good idea, given the fact that I don't
know for sure if the scrub process on the secondary machine will be
useful in case of data corruption... but so far so good, and it's
probably better than nothing.
I still remember, before ZFS, that any good RAID controller would have a
background consistency check task, and such a task could be assigned a
priority like low, medium, or high... going back to ZFS, what's the
possibility of getting this feature as well?


Just out of curiosity, do the Sun OpenStorage appliances, or Nexenta-based
ones, have any scrub task enabled by default? I would like to get
some feedback from users that run ZFS appliances regarding the impact of
running a scrub on their appliances.


Bruno

On 28-4-2010 22:39, David Dyer-Bennet wrote:
 On Wed, April 28, 2010 10:16, Eric D. Mudama wrote:
   
 On Wed, Apr 28 at  1:34, Tonmaus wrote:
 
  Zfs scrub needs to access all written data on all disks and is usually
  disk-seek or disk I/O bound so it is difficult to keep it from hogging
  the disk resources.  A pool based on mirror devices will behave much
  more nicely while being scrubbed than one based on RAIDz2.
 
 Experience seconded entirely. I'd like to repeat that I think we
 need more efficient load balancing functions in order to keep
 housekeeping payload manageable. Detrimental side effects of scrub
 should not be a decision point for choosing certain hardware or
 redundancy concepts in my opinion.
   
 While there may be some possible optimizations, i'm sure everyone
 would love the random performance of mirror vdevs, combined with the
 redundancy of raidz3 and the space of a raidz1.  However, as in all
 systems, there are tradeoffs.
 
 The situations being mentioned are much worse than what seem reasonable
 tradeoffs to me.  Maybe that's because my intuition is misleading me about
 what's available.  But if the normal workload of a system uses 25% of its
 sustained IOPS, and a scrub is run at low priority, I'd like to think
 that during a scrub I'd see a little degradation in performance, and that
 the scrub would take 25% or so longer than it would on an idle system. 
 There's presumably some inefficiency, so the two loads don't just add
 perfectly; so maybe another 5% lost to that?  That's the big uncertainty. 
 I have a hard time believing in 20% lost to that.

 Do you think that's a reasonable outcome to hope for?  Do you think ZFS is
 close to meeting it?

 People with systems that live at 75% all day are obviously going to have
 more problems than people who live at 25%!

   




Re: [zfs-discuss] Performance drop during scrub?

2010-04-29 Thread Roy Sigurd Karlsbakk
I got this hint from Richard Elling, but haven't had time to test it much. 
Perhaps someone else could help? 

roy 

 Interesting. If you'd like to experiment, you can change the limit of the 
 number of scrub I/Os queued to each vdev. The default is 10, but that 
 is too close to the normal limit. You can see the current scrub limit via: 
 
 # echo zfs_scrub_limit/D | mdb -k 
 zfs_scrub_limit: 
 zfs_scrub_limit:10 
 
 you can change it with: 
 # echo zfs_scrub_limit/W0t2 | mdb -kw 
 zfs_scrub_limit:0xa = 0x2 
 
 # echo zfs_scrub_limit/D | mdb -k 
 zfs_scrub_limit: 
 zfs_scrub_limit:2 
 
 In theory, this should help your scenario, but I do not believe this has 
 been exhaustively tested in the lab. Hopefully, it will help. 
 -- richard 


- Bruno Sousa bso...@epinfante.com wrote: 


Indeed the scrub seems to take too much resources from a live system. 
For instance i have a server with 24 disks (SATA 1TB) serving as NFS store to a 
linux machine holding user mailboxes. I have around 200 users, with maybe 
30-40% of active users at the same time. 
As soon as the scrub process kicks in, linux box starts to give messages like  
nfs server not available and the users start to complain that the Outlook 
gives connection timeout. Again, as soon as the scrub process stops 
everything comes to normal. 
So for me, it's real issue the fact that the scrub takes so many resources of 
the system, making it pretty much unusable. In my case i did a workaround, 
where basically i have zfs send/receive from this server to another server and 
the scrub process is now running on the second server. 
I don't know if this such a good idea, given the fact that i don't know for 
sure if the scrub process in the secondary machine will be usefull in case of 
data corruption...but so far so good , and it's probably better than nothing. 
I still remember before ZFS , that any good RAID controller would have a 
background consistency check task, and such a task would be possible to assign 
priority , like low, medium, high ...going back to ZFS what's the possibility 
of getting this feature as well? 


Just out as curiosity , the Sun OpenStorage appliances , or Nexenta based ones, 
have any scrub task enabled by default ? I would like to get some feedback from 
users that run ZFS appliances regarding the impact of running a scrub on their 
appliances. 


Bruno 

On 28-4-2010 22:39, David Dyer-Bennet wrote: 

On Wed, April 28, 2010 10:16, Eric D. Mudama wrote: 

On Wed, Apr 28 at  1:34, Tonmaus wrote: 



Zfs scrub needs to access all written data on all disks and is usually
disk-seek or disk I/O bound so it is difficult to keep it from hogging
the disk resources.  A pool based on mirror devices will behave much
more nicely while being scrubbed than one based on RAIDz2.

Experience seconded entirely. I'd like to repeat that I think we
need more efficient load balancing functions in order to keep
housekeeping payload manageable. Detrimental side effects of scrub
should not be a decision point for choosing certain hardware or
redundancy concepts in my opinion.

While there may be some possible optimizations, i'm sure everyone
would love the random performance of mirror vdevs, combined with the
redundancy of raidz3 and the space of a raidz1.  However, as in all
systems, there are tradeoffs.

The situations being mentioned are much worse than what seem reasonable
tradeoffs to me.  Maybe that's because my intuition is misleading me about
what's available.  But if the normal workload of a system uses 25% of its
sustained IOPS, and a scrub is run at low priority, I'd like to think
that during a scrub I'd see a little degradation in performance, and that
the scrub would take 25% or so longer than it would on an idle system. 
There's presumably some inefficiency, so the two loads don't just add
perfectly; so maybe another 5% lost to that?  That's the big uncertainty. 
I have a hard time believing in 20% lost to that.

Do you think that's a reasonable outcome to hope for?  Do you think ZFS is
close to meeting it?

People with systems that live at 75% all day are obviously going to have
more problems than people who live at 25%! 




Re: [zfs-discuss] Performance drop during scrub?

2010-04-29 Thread Robert Milkowski

On 28/04/2010 21:39, David Dyer-Bennet wrote:


The situations being mentioned are much worse than what seem reasonable
tradeoffs to me.  Maybe that's because my intuition is misleading me about
what's available.  But if the normal workload of a system uses 25% of its
sustained IOPS, and a scrub is run at low priority, I'd like to think
that during a scrub I'd see a little degradation in performance, and that
the scrub would take 25% or so longer than it would on an idle system.
There's presumably some inefficiency, so the two loads don't just add
perfectly; so maybe another 5% lost to that?  That's the big uncertainty.
I have a hard time believing in 20% lost to that.

   


Well, it's not that easy, as there are many other factors you need to 
take into account.
For example, how many I/Os are you allowing to be queued per device? This 
might affect the latency for your application.

Or if you have a disk array with its own cache - just by doing a scrub you 
might be pushing other entries out of the cache, which might impact the 
performance of your application.

Then there might be a SAN, and so on.

I'm not saying there is no room for improvement here. All I'm saying is 
that it is not as easy a problem as it seems.


--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance drop during scrub?

2010-04-29 Thread Tomas Ögren
On 29 April, 2010 - Roy Sigurd Karlsbakk sent me these 10K bytes:

 I got this hint from Richard Elling, but haven't had time to test it much. 
 Perhaps someone else could help? 
 
 roy 
 
  Interesting. If you'd like to experiment, you can change the limit of the 
  number of scrub I/Os queued to each vdev. The default is 10, but that 
  is too close to the normal limit. You can see the current scrub limit via: 
  
  # echo zfs_scrub_limit/D | mdb -k 
  zfs_scrub_limit: 
  zfs_scrub_limit:10 
  
  you can change it with: 
  # echo zfs_scrub_limit/W0t2 | mdb -kw 
  zfs_scrub_limit:0xa = 0x2 
  
  # echo zfs_scrub_limit/D | mdb -k 
  zfs_scrub_limit: 
  zfs_scrub_limit:2 
  
  In theory, this should help your scenario, but I do not believe this has 
  been exhaustively tested in the lab. Hopefully, it will help. 
  -- richard 

If I'm reading the code right, it's only used when creating a new vdev
(import, zpool create, maybe at boot).. So I took an alternate route:

http://pastebin.com/hcYtQcJH

(spa_scrub_maxinflight used to be 0x46 (70 decimal) due to 7 devices *
zfs_scrub_limit(10) = 70..)
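
(In case the pastebin disappears: the live tweak presumably follows the same
mdb pattern as the zfs_scrub_limit example above -- untested, and 16 is only
an example value:)

  # echo spa_scrub_maxinflight/D | mdb -k
  # echo spa_scrub_maxinflight/W0t16 | mdb -kw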

With these lower numbers, our pool is much more responsive over NFS..

 scrub: scrub in progress for 0h40m, 0.10% done, 697h29m to go

Might take a while though. We've taken periodic snapshots and have
snapshots from 2008, which probably has fragmented the pool beyond
sanity or something..

 - Bruno Sousa bso...@epinfante.com wrote: 
 
 
 Indeed the scrub seems to take too much resources from a live system. 
 For instance i have a server with 24 disks (SATA 1TB) serving as NFS store to 
 a linux machine holding user mailboxes. I have around 200 users, with maybe 
 30-40% of active users at the same time. 
 As soon as the scrub process kicks in, linux box starts to give messages like 
  nfs server not available and the users start to complain that the Outlook 
 gives connection timeout. Again, as soon as the scrub process stops 
 everything comes to normal. 
 So for me, it's real issue the fact that the scrub takes so many resources of 
 the system, making it pretty much unusable. In my case i did a workaround, 
 where basically i have zfs send/receive from this server to another server 
 and the scrub process is now running on the second server. 
 I don't know if this such a good idea, given the fact that i don't know for 
 sure if the scrub process in the secondary machine will be usefull in case of 
 data corruption...but so far so good , and it's probably better than nothing. 
 I still remember before ZFS , that any good RAID controller would have a 
 background consistency check task, and such a task would be possible to 
 assign priority , like low, medium, high ...going back to ZFS what's the 
 possibility of getting this feature as well? 
 
 
 Just out as curiosity , the Sun OpenStorage appliances , or Nexenta based 
 ones, have any scrub task enabled by default ? I would like to get some 
 feedback from users that run ZFS appliances regarding the impact of running a 
 scrub on their appliances. 
 
 
 Bruno 
 
 On 28-4-2010 22:39, David Dyer-Bennet wrote: 
 
 On Wed, April 28, 2010 10:16, Eric D. Mudama wrote: 
 
 On Wed, Apr 28 at  1:34, Tonmaus wrote: 
 
 
 
  Zfs scrub needs to access all written data on all disks and is usually
  disk-seek or disk I/O bound so it is difficult to keep it from hogging
  the disk resources.  A pool based on mirror devices will behave much
  more nicely while being scrubbed than one based on RAIDz2.
 
 Experience seconded entirely. I'd like to repeat that I think we
 need more efficient load balancing functions in order to keep
 housekeeping payload manageable. Detrimental side effects of scrub
 should not be a decision point for choosing certain hardware or
 redundancy concepts in my opinion.
 
 While there may be some possible optimizations, i'm sure everyone
 would love the random performance of mirror vdevs, combined with the
 redundancy of raidz3 and the space of a raidz1.  However, as in all
 systems, there are tradeoffs.
 
 The situations being mentioned are much worse than what seem reasonable
 tradeoffs to me.  Maybe that's because my intuition is misleading me about
 what's available.  But if the normal workload of a system uses 25% of its
 sustained IOPS, and a scrub is run at low priority, I'd like to think
 that during a scrub I'd see a little degradation in performance, and that
 the scrub would take 25% or so longer than it would on an idle system. 
 There's presumably some inefficiency, so the two loads don't just add
 perfectly; so maybe another 5% lost to that?  That's the big uncertainty. 
 I have a hard time believing in 20% lost to that.
 
 Do you think that's a reasonable outcome to hope for?  Do you think ZFS is
 close to meeting it?
 
 People with systems that live at 75% all day are obviously going to have
 more problems than people who live at 25%! 
 
 

Re: [zfs-discuss] Performance drop during scrub?

2010-04-29 Thread Tomas Ögren
On 29 April, 2010 - Tomas Ögren sent me these 5,8K bytes:

 On 29 April, 2010 - Roy Sigurd Karlsbakk sent me these 10K bytes:
 
  I got this hint from Richard Elling, but haven't had time to test it much. 
  Perhaps someone else could help? 
  
  roy 
  
   Interesting. If you'd like to experiment, you can change the limit of the 
   number of scrub I/Os queued to each vdev. The default is 10, but that 
   is too close to the normal limit. You can see the current scrub limit 
   via: 
   
   # echo zfs_scrub_limit/D | mdb -k 
   zfs_scrub_limit: 
   zfs_scrub_limit:10 
   
   you can change it with: 
   # echo zfs_scrub_limit/W0t2 | mdb -kw 
   zfs_scrub_limit:0xa = 0x2 
   
   # echo zfs_scrub_limit/D | mdb -k 
   zfs_scrub_limit: 
   zfs_scrub_limit:2 
   
   In theory, this should help your scenario, but I do not believe this has 
   been exhaustively tested in the lab. Hopefully, it will help. 
   -- richard 
 
 If I'm reading the code right, it's only used when creating a new vdev
 (import, zpool create, maybe at boot).. So I took an alternate route:
 
 http://pastebin.com/hcYtQcJH
 
 (spa_scrub_maxinflight used to be 0x46 (70 decimal) due to 7 devices *
 zfs_scrub_limit(10) = 70..)
 
 With these lower numbers, our pool is much more responsive over NFS..

But taking snapshots is quite bad.. A single recursive snapshot over
~800 filesystems took about 45 minutes, with NFS operations taking 5-10
seconds.. Snapshots usually take 10-30 seconds..

  scrub: scrub in progress for 0h40m, 0.10% done, 697h29m to go

 scrub: scrub in progress for 1h41m, 2.10% done, 78h35m to go

This is chugging along..

The server is a Fujitsu RX300 with a Quad Xeon 1.6GHz, 6G ram, 8x400G
SATA through a U320SCSI-SATA box - Infortrend A08U-G1410, Sol10u8.
Should have enough oompf, but when you combine snapshot with a
scrub/resilver, sync performance gets abysmal.. Should probably try
adding a ZIL when u9 comes, so we can remove it again if performance
goes crap.

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se


Re: [zfs-discuss] Performance drop during scrub?

2010-04-29 Thread Richard Elling
On Apr 29, 2010, at 5:52 AM, Tomas Ögren wrote:

 On 29 April, 2010 - Tomas Ögren sent me these 5,8K bytes:
 
 On 29 April, 2010 - Roy Sigurd Karlsbakk sent me these 10K bytes:
 
 I got this hint from Richard Elling, but haven't had time to test it much. 
 Perhaps someone else could help? 
 
 roy 
 
 Interesting. If you'd like to experiment, you can change the limit of the 
 number of scrub I/Os queued to each vdev. The default is 10, but that 
 is too close to the normal limit. You can see the current scrub limit via: 
 
 # echo zfs_scrub_limit/D | mdb -k 
 zfs_scrub_limit: 
 zfs_scrub_limit:10 
 
 you can change it with: 
 # echo zfs_scrub_limit/W0t2 | mdb -kw 
 zfs_scrub_limit:0xa = 0x2 
 
 # echo zfs_scrub_limit/D | mdb -k 
 zfs_scrub_limit: 
 zfs_scrub_limit:2 
 
 In theory, this should help your scenario, but I do not believe this has 
 been exhaustively tested in the lab. Hopefully, it will help. 
 -- richard 
 
 If I'm reading the code right, it's only used when creating a new vdev
 (import, zpool create, maybe at boot).. So I took an alternate route:
 
 http://pastebin.com/hcYtQcJH
 
 (spa_scrub_maxinflight used to be 0x46 (70 decimal) due to 7 devices *
 zfs_scrub_limit(10) = 70..)
 
 With these lower numbers, our pool is much more responsive over NFS..
 
 But taking snapshots is quite bad.. A single recursive snapshot over
 ~800 filesystems took about 45 minutes, with NFS operations taking 5-10
 seconds.. Snapshots usually take 10-30 seconds..
 
 scrub: scrub in progress for 0h40m, 0.10% done, 697h29m to go
 
 scrub: scrub in progress for 1h41m, 2.10% done, 78h35m to go
 
 This is chugging along..
 
 The server is a Fujitsu RX300 with a Quad Xeon 1.6GHz, 6G ram, 8x400G
 SATA through a U320SCSI-SATA box - Infortrend A08U-G1410, Sol10u8.

slow disks == poor performance

 Should have enough oompf, but when you combine snapshot with a
 scrub/resilver, sync performance gets abysmal.. Should probably try
 adding a ZIL when u9 comes, so we can remove it again if performance
 goes crap.

A separate log will not help.  Try faster disks.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 







Re: [zfs-discuss] Performance drop during scrub?

2010-04-29 Thread Tomas Ögren
On 29 April, 2010 - Richard Elling sent me these 2,5K bytes:

  With these lower numbers, our pool is much more responsive over NFS..
  
  But taking snapshots is quite bad.. A single recursive snapshot over
  ~800 filesystems took about 45 minutes, with NFS operations taking 5-10
  seconds.. Snapshots usually take 10-30 seconds..
  
  scrub: scrub in progress for 0h40m, 0.10% done, 697h29m to go
  
  scrub: scrub in progress for 1h41m, 2.10% done, 78h35m to go
  
  This is chugging along..
  
  The server is a Fujitsu RX300 with a Quad Xeon 1.6GHz, 6G ram, 8x400G
  SATA through a U320SCSI-SATA box - Infortrend A08U-G1410, Sol10u8.
 
 slow disks == poor performance

I know they're not fast, but they shouldn't need 10-30 seconds to
create a directory. They do perfectly well in all combinations, except
when a scrub comes along (or sometimes when a snapshot feels like taking
45 minutes instead of 4.5 seconds). iostat says the disks aren't 100%
busy, the storage box itself doesn't seem to be busy, yet with zfs they
go downhill in some conditions..

  Should have enough oompf, but when you combine snapshot with a
  scrub/resilver, sync performance gets abysmal.. Should probably try
  adding a ZIL when u9 comes, so we can remove it again if performance
  goes crap.
 
 A separate log will not help.  Try faster disks.

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se


Re: [zfs-discuss] Performance drop during scrub?

2010-04-29 Thread Katzke, Karl
  The server is a Fujitsu RX300 with a Quad Xeon 1.6GHz, 6G ram, 8x400G
  SATA through a U320SCSI-SATA box - Infortrend A08U-G1410, Sol10u8.

 slow disks == poor performance

  Should have enough oompf, but when you combine snapshot with a
  scrub/resilver, sync performance gets abysmal.. Should probably try
  adding a ZIL when u9 comes, so we can remove it again if performance
  goes crap.

 A separate log will not help.  Try faster disks.

We're seeing the same thing in Sol10u8 with both 300gb 15k rpm SAS disks 
in-board on a Sun x4250 and an external chassis with 1tb 7200 rpm SATA disks 
connected via SAS. Faster disks aren't the problem; there's a fundamental issue 
with ZFS [iscsi;nfs;cifs] share performance under scrub & resilver. 

-K 

--- 
Karl Katzke
Systems Analyst II
TAMU DRGS






Re: [zfs-discuss] Performance drop during scrub?

2010-04-29 Thread Bob Friesenhahn

On Thu, 29 Apr 2010, Roy Sigurd Karlsbakk wrote:


While there may be some possible optimizations, i'm sure everyone
would love the random performance of mirror vdevs, combined with the
redundancy of raidz3 and the space of a raidz1.  However, as in all
systems, there are tradeoffs.


In my opinion periodic scrubs are most useful for pools based on 
mirrors, or raidz1, and much less useful for pools based on raidz2 or 
raidz3.  It is useful to run a scrub at least once on a well-populated 
new pool in order to validate the hardware and OS, but otherwise, the 
scrub is most useful for discovering bit-rot in singly-redundant 
pools.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Performance drop during scrub?

2010-04-29 Thread Ian Collins

On 04/30/10 10:35 AM, Bob Friesenhahn wrote:

On Thu, 29 Apr 2010, Roy Sigurd Karlsbakk wrote:


While there may be some possible optimizations, i'm sure everyone
would love the random performance of mirror vdevs, combined with the
redundancy of raidz3 and the space of a raidz1.  However, as in all
systems, there are tradeoffs.


In my opinion periodic scrubs are most useful for pools based on 
mirrors, or raidz1, and much less useful for pools based on raidz2 or 
raidz3.  It is useful to run a scrub at least once on a well-populated 
new pool in order to validate the hardware and OS, but otherwise, the 
scrub is most useful for discovering bit-rot in singly-redundant pools.



I agree.

I look after an x4500 with a pool of raidz2 vdevs that I can't run 
scrubs on due to the dire impact on performance. That's one reason I'd 
never use raidz1 in a real system.


--
Ian.



Re: [zfs-discuss] Performance drop during scrub?

2010-04-28 Thread Tonmaus
 Zfs scrub needs to access all written data on all
 disks and is usually 
 disk-seek or disk I/O bound so it is difficult to
 keep it from hogging 
 the disk resources.  A pool based on mirror devices
 will behave much 
 more nicely while being scrubbed than one based on
 RAIDz2.

Experience seconded entirely. I'd like to repeat that I think we need more 
efficient load balancing functions in order to keep housekeeping payload 
manageable. Detrimental side effects of scrub should not be a decision point 
for choosing certain hardware or redundancy concepts in my opinion. 

Regards,

Tonmaus
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Performance drop during scrub?

2010-04-28 Thread Richard Elling
On Apr 28, 2010, at 1:34 AM, Tonmaus wrote:
 Zfs scrub needs to access all written data on all
 disks and is usually 
 disk-seek or disk I/O bound so it is difficult to
 keep it from hogging 
 the disk resources.  A pool based on mirror devices
 will behave much 
 more nicely while being scrubbed than one based on
 RAIDz2.

The data I have does not show a difference in the disk loading 
while scrubbing for different pool configs.  All HDDs become IOPS 
bound.

If you have SSDs, then there can be a bandwidth issue.  As soon as
someone sends me a big pile of SSDs, I'll characterize the scrub load :-)

 Experience seconded entirely. I'd like to repeat that I think we need more 
 efficient load balancing functions in order to keep housekeeping payload 
 manageable.

The current load balancing is based on the number of queued I/Os.
For later builds, I think the algorithm might be out of balance, but
the jury is still out.

 Detrimental side effects of scrub should not be a decision point for choosing 
 certain hardware or redundancy concepts in my opinion. 

Scrub performance is directly impacted by IOPS performance of the 
pool.  Slow disks == poor performance.
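
A quick way to see this on a live pool while a scrub is running (the pool
name below is just a placeholder):

   # per-vdev operations per second, sampled every 5 seconds
   zpool iostat -v tank 5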
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 







Re: [zfs-discuss] Performance drop during scrub?

2010-04-28 Thread Eric D. Mudama

On Wed, Apr 28 at  1:34, Tonmaus wrote:

Zfs scrub needs to access all written data on all
disks and is usually
disk-seek or disk I/O bound so it is difficult to
keep it from hogging
the disk resources.  A pool based on mirror devices
will behave much
more nicely while being scrubbed than one based on
RAIDz2.


Experience seconded entirely. I'd like to repeat that I think we
need more efficient load balancing functions in order to keep
housekeeping payload manageable. Detrimental side effects of scrub
should not be a decision point for choosing certain hardware or
redundancy concepts in my opinion.


While there may be some possible optimizations, i'm sure everyone
would love the random performance of mirror vdevs, combined with the
redundancy of raidz3 and the space of a raidz1.  However, as in all
systems, there are tradeoffs.

To scrub a long lived, full pool, you must read essentially every
sector on every component device, and if you're going to do it in the
order in which your transactions occurred, it'll wind up devolving to
random IO eventually.

You can choose to bias your workloads so that foreground IO takes
priority over scrub, but then you've got the cases where people
complain that their scrub takes too long.  There may be knobs for
individuals to use, but I don't think overall there's a magic answer.

--eric


--
Eric D. Mudama
edmud...@mail.bounceswoosh.org



Re: [zfs-discuss] Performance drop during scrub?

2010-04-28 Thread Tonmaus
Hi Eric,

 While there may be some possible optimizations, i'm
 sure everyone
 would love the random performance of mirror vdevs,
 combined with the
 redundancy of raidz3 and the space of a raidz1.
  However, as in all
 systems, there are tradeoffs.

I think we all may agree that the topic here is scrub trade-offs, specifically. 
My question is whether manageability of the pool (and that includes periodic 
scrubs) is a trade-off as well. It would be very bad news if it were. 
Maintenance functions should be practicable on any supported configuration, if 
possible.
 
 You can choose to bias your workloads so that
 foreground IO takes
 priority over scrub, but then you've got the cases
 where people
 complain that their scrub takes too long.  There may
 be knobs for
 individuals to use, but I don't think overall there's
 a magic answer.

The priority balance only works as long as the I/O is within ZFS. As soon as the 
request is in the pipe of the controller/disk, no further bias will occur, as 
that subsystem is agnostic to ZFS rules. This is where Richard's answer (just 
above, if you are reading this from Jive) kicks in. It leads to the pool being 
basically not operational from a production point of view during a scrub pass. 
From that perspective, any scrub pass exceeding a periodically acceptable 
service window is too long. In such a situation, a pause option that lets a 
scrub pass be resumed in the next service window might help. The advantage: such 
an option would be usable on any hardware.
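
As a rough illustration only (untested on my side), something like a service
window can be approximated from cron today, although "zpool scrub -s" cancels
the pass rather than pausing it, so the next run starts over from the
beginning:

   # hypothetical crontab entries for a pool named "tank"
   0 1 * * 6   zpool scrub tank       # start the scrub Saturday at 01:00
   0 7 * * 6   zpool scrub -s tank    # stop it at 07:00 if it is still running

Which is exactly why a real pause/resume option would be more useful than a
cancel.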

Regards,

Tonmaus
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Performance drop during scrub?

2010-04-28 Thread Tomas Ögren
On 28 April, 2010 - Eric D. Mudama sent me these 1,6K bytes:

 On Wed, Apr 28 at  1:34, Tonmaus wrote:
 Zfs scrub needs to access all written data on all
 disks and is usually
 disk-seek or disk I/O bound so it is difficult to
 keep it from hogging
 the disk resources.  A pool based on mirror devices
 will behave much
 more nicely while being scrubbed than one based on
 RAIDz2.

 Experience seconded entirely. I'd like to repeat that I think we
 need more efficient load balancing functions in order to keep
 housekeeping payload manageable. Detrimental side effects of scrub
 should not be a decision point for choosing certain hardware or
 redundancy concepts in my opinion.

 While there may be some possible optimizations, i'm sure everyone
 would love the random performance of mirror vdevs, combined with the
 redundancy of raidz3 and the space of a raidz1.  However, as in all
 systems, there are tradeoffs.

 To scrub a long lived, full pool, you must read essentially every
 sector on every component device, and if you're going to do it in the
 order in which your transactions occurred, it'll wind up devolving to
 random IO eventually.

 You can choose to bias your workloads so that foreground IO takes
 priority over scrub, but then you've got the cases where people
 complain that their scrub takes too long.  There may be knobs for
 individuals to use, but I don't think overall there's a magic answer.

We have one system with a raidz2 of 8 SATA disks.. If we start a scrub,
then you can kiss any NFS performance goodbye.. A single mkdir or
creating a file can take 30 seconds.. Single write()s can take 5-30
seconds.. Without the scrub, it's perfectly fine. Local performance
during scrub is fine. NFS performance becomes useless.

This means we can't do a scrub, because doing so will basically disable
the NFS service for a day or three. If the scrub were less aggressive
and took a week to complete, it would probably not hurt performance
as badly..

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se


Re: [zfs-discuss] Performance drop during scrub?

2010-04-28 Thread Bob Friesenhahn

On Wed, 28 Apr 2010, Richard Elling wrote:

the disk resources.  A pool based on mirror devices will behave 
much more nicely while being scrubbed than one based on RAIDz2.


The data I have does not show a difference in the disk loading while 
scrubbing for different pool configs.  All HDDs become IOPS bound.


It is true that all HDDs become IOPS bound, but the mirror 
configuration offers more usable IOPS, and therefore the user waits 
less time for each request to be satisfied.
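
As a rough back-of-the-envelope illustration (the ~100 random-read IOPS per
7200 rpm SATA disk is an assumption, not a measurement, and caching is
ignored):

   8-disk raidz2 vdev  : ~1 disk of random-read IOPS   =  ~100 IOPS for the pool
   4 x 2-way mirrors   : reads spread over both sides  =  ~4 x 2 x 100 = ~800 IOPS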


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Performance drop during scrub?

2010-04-28 Thread Richard Elling
adding on...

On Apr 28, 2010, at 8:57 AM, Tomas Ögren wrote:

 On 28 April, 2010 - Eric D. Mudama sent me these 1,6K bytes:
 
 On Wed, Apr 28 at  1:34, Tonmaus wrote:
 Zfs scrub needs to access all written data on all
 disks and is usually
 disk-seek or disk I/O bound so it is difficult to
 keep it from hogging
 the disk resources.  A pool based on mirror devices
 will behave much
 more nicely while being scrubbed than one based on
 RAIDz2.
 
 Experience seconded entirely. I'd like to repeat that I think we
 need more efficient load balancing functions in order to keep
 housekeeping payload manageable. Detrimental side effects of scrub
 should not be a decision point for choosing certain hardware or
 redundancy concepts in my opinion.
 
 While there may be some possible optimizations, i'm sure everyone
 would love the random performance of mirror vdevs, combined with the
 redundancy of raidz3 and the space of a raidz1.  However, as in all
 systems, there are tradeoffs.
 
 To scrub a long lived, full pool, you must read essentially every
 sector on every component device, and if you're going to do it in the
 order in which your transactions occurred, it'll wind up devolving to
 random IO eventually.
 
 You can choose to bias your workloads so that foreground IO takes
 priority over scrub, but then you've got the cases where people
 complain that their scrub takes too long.  There may be knobs for
 individuals to use, but I don't think overall there's a magic answer.
 
 We have one system with a raidz2 of 8 SATA disks.. If we start a scrub,
 then you can kiss any NFS performance goodbye.. A single mkdir or
 creating a file can take 30 seconds.. Single write()s can take 5-30
 seconds.. Without the scrub, it's perfectly fine. Local performance
 during scrub is fine. NFS performance becomes useless.
 
 This means we can't do a scrub, because doing so will basically disable
 the NFS service for a day or three. If the scrub were less aggressive
 and took a week to complete, it would probably not hurt performance
 as badly..

Which OS release?

Later versions of ZFS have a scrub/resilver throttle of sorts.
There is no exposed interface to manage the throttle, and I doubt
there is much (if any) community experience with using it in real-world
situations.
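
For the curious: on builds that do have the newer scan code, the throttle is
just a set of kernel variables that can be poked with mdb. The variable names
below are an assumption about what your particular build ships (verify they
exist before touching anything), and changing them is, of course, unsupported:

   # sketch only -- confirm these symbols exist on your build first
   echo "zfs_scrub_delay/D" | mdb -k        # per-I/O scrub delay applied when the pool is busy
   echo "zfs_scan_idle/D"   | mdb -k        # how recently other I/O must have occurred for the delay to kick in
   echo "zfs_scrub_delay/W0t10" | mdb -kw   # example: throttle scrub harder under competing load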
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 







Re: [zfs-discuss] Performance drop during scrub?

2010-04-28 Thread David Dyer-Bennet

On Wed, April 28, 2010 10:16, Eric D. Mudama wrote:
 On Wed, Apr 28 at  1:34, Tonmaus wrote:
 Zfs scrub needs to access all written data on all
 disks and is usually
 disk-seek or disk I/O bound so it is difficult to
 keep it from hogging
 the disk resources.  A pool based on mirror devices
 will behave much
 more nicely while being scrubbed than one based on
 RAIDz2.

 Experience seconded entirely. I'd like to repeat that I think we
 need more efficient load balancing functions in order to keep
 housekeeping payload manageable. Detrimental side effects of scrub
 should not be a decision point for choosing certain hardware or
 redundancy concepts in my opinion.

 While there may be some possible optimizations, i'm sure everyone
 would love the random performance of mirror vdevs, combined with the
 redundancy of raidz3 and the space of a raidz1.  However, as in all
 systems, there are tradeoffs.

The situations being mentioned are much worse than what seem reasonable
tradeoffs to me.  Maybe that's because my intuition is misleading me about
what's available.  But if the normal workload of a system uses 25% of its
sustained IOPS, and a scrub is run at low priority, I'd like to think
that during a scrub I'd see a little degradation in performance, and that
the scrub would take 25% or so longer than it would on an idle system. 
There's presumably some inefficiency, so the two loads don't just add
perfectly; maybe another 5% lost to that?  That's the big uncertainty.
I have a hard time believing that 20% would be lost to it.

Do you think that's a reasonable outcome to hope for?  Do you think ZFS is
close to meeting it?

People with systems that live at 75% all day are obviously going to have
more problems than people who live at 25%!

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info



Re: [zfs-discuss] Performance drop during scrub?

2010-04-27 Thread Bob Friesenhahn

On Tue, 27 Apr 2010, Roy Sigurd Karlsbakk wrote:



I have a test system with snv134 and 8x2TB drives in RAIDz2 and 
currently no Zil or L2ARC. I noticed the I/O speed to NFS shares on 
the testpool drops to something hardly usable while scrubbing the 
pool.


How can I address this? Will adding Zil or L2ARC help? Is it 
possible to tune down scrub's priority somehow?


Does the NFS performance problem seem to be mainly read performance, 
or write performance?  If it is primarily a read performance issue, 
then adding lots more RAM and/or a L2ARC device should help since that 
would reduce the need to (re-)read the underlying disks during the 
scrub.  Likewise, adding an intent log SSD would help with NFS write 
performance.
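
For example (the device names below are placeholders, not devices from the
poster's system):

   # hypothetical SSDs; substitute your own device names
   zpool add testpool cache c1t5d0   # L2ARC read cache, reduces re-reads of the pool disks
   zpool add testpool log c1t4d0     # separate intent log, helps synchronous NFS writes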


Zfs scrub needs to access all written data on all disks and is usually 
disk-seek or disk I/O bound so it is difficult to keep it from hogging 
the disk resources.  A pool based on mirror devices will behave much 
more nicely while being scrubbed than one based on RAIDz2.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Performance drop during scrub?

2010-04-27 Thread Ian Collins

On 04/28/10 03:17 AM, Roy Sigurd Karlsbakk wrote:

Hi all

I have a test system with snv134 and 8x2TB drives in RAIDz2 and currently no 
Zil or L2ARC. I noticed the I/O speed to NFS shares on the testpool drops to 
something hardly usable while scrubbing the pool.

   

Is that small random or block I/O?

I've found latency to be the killer rather than throughput, at least when 
receiving snapshots.  In normal operation, receiving an empty snapshot 
is a sub-second operation.  While resilvering, it can take up to 30 
seconds.  The write speed on bigger snapshots is still acceptable.


--
Ian.



Re: [zfs-discuss] Performance drop during scrub?

2010-04-27 Thread Ian Collins

On 04/28/10 10:01 AM, Bob Friesenhahn wrote:

On Wed, 28 Apr 2010, Ian Collins wrote:


On 04/28/10 03:17 AM, Roy Sigurd Karlsbakk wrote:

Hi all

I have a test system with snv134 and 8x2TB drives in RAIDz2 and 
currently no Zil or L2ARC. I noticed the I/O speed to NFS shares on 
the testpool drops to something hardly usable while scrubbing the pool.




Is that small random or block I/O?

I've found latency to be the killer rather than throughput, at least 
when receiving snapshots.  In normal operation, receiving an empty 
snapshot is a sub-second operation.  While resilvering, it can take 
up to 30 seconds.  The write speed on bigger snapshots is still 
acceptable.



zfs scrub != zfs send


Where did I say it did?  I didn't even mention zfs send.

My observation concerned poor performance (latency) during a scrub/resilver.

--

Ian.
