On 5/8/2019 19:28, Kevin P. Neal wrote:
> On Wed, May 08, 2019 at 11:28:57AM -0500, Karl Denninger wrote:
>> If you have pool(s) that are taking *two weeks* to run a scrub IMHO
>> either something is badly wrong or you need to rethink organization of
>> the pool structure -- that is, IMHO you likely either have a severe
>> performance problem with one or more members or an architectural problem
>> you *really* need to determine and fix.  If a scrub takes two weeks
>> *then a resilver could conceivably take that long as well* and that's
>> *extremely* bad as the window for getting screwed is at its worst when a
>> resilver is being run.
> Wouldn't having multiple vdevs mitigate the issue for resilvers (but not
> scrubs)? My understanding, please correct me if I'm wrong, is that a
> resilver only reads the surviving drives in that specific vdev.

Yes.
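
To illustrate (a hypothetical layout -- the device names here are made up,
not from my pool): with a pool built from two raidz2 vdevs, replacing a
failed disk resilvers from the surviving members of that one vdev only:

        # Two top-level raidz2 vdevs in one pool
        zpool create tank \
            raidz2 da0 da1 da2 da3 da4 \
            raidz2 da5 da6 da7 da8 da9

        # Replacing da2 reads only da0, da1, da3 and da4;
        # the second raidz2 vdev isn't read to rebuild the data.
        zpool replace tank da2 da10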

In addition, while "most-modern" revisions have material improvements in
scrub times "out of the box" (very much so), a bit of tuning makes for
very material differences on older revisions.  Specifically, maxinflight
can be a big deal given a reasonable amount of RAM (e.g. 16 or 32GB), as
is async_write_min_active (raise it to "2"; you may get a bit more with
"3", but not a lot).

I have a scrub running right now and this is what it looks like:

Disks   da2   da3   da4   da5   da8   da9  da10  
KB/t  10.40 11.03   103   108   122 98.11 98.48
tps      46    45  1254  1205  1062  1324  1319
MB/s   0.46  0.48   127   127   127   127   127
%busy     0     0    48    62    97    28    31
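
If you want to watch a scrub the same way, a per-disk view like the above is
available from systat (or gstat, if you prefer per-provider numbers):

        systat -vmstat 2
        # or
        gstat -p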

Here's the current status of that pool:

  pool: zs
 state: ONLINE
  scan: scrub in progress since Thu May  9 03:10:00 2019
        11.9T scanned at 643M/s, 11.0T issued at 593M/s, 12.8T total
        0 repaired, 85.58% done, 0 days 00:54:29 to go
config:

        NAME               STATE     READ WRITE CKSUM
        zs                 ONLINE       0     0     0
          raidz2-0         ONLINE       0     0     0
            gpt/rust1.eli  ONLINE       0     0     0
            gpt/rust2.eli  ONLINE       0     0     0
            gpt/rust3.eli  ONLINE       0     0     0
            gpt/rust4.eli  ONLINE       0     0     0
            gpt/rust5.eli  ONLINE       0     0     0

errors: No known data errors
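
As a quick sanity check, that ETA follows straight from the issued figures
above (assuming the usual power-of-two units zpool reports):

        remaining = 12.8T total - 11.0T issued = 1.8 TiB
        1.8 TiB / 593 MiB/s = (1.8 * 1024 * 1024) MiB / 593 MiB/s
                            ~= 3,183 s ~= 53 minutes

which lines up with the "0 days 00:54:29 to go" figure.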

Indeed it will be done in about an hour; this is an "automatic" scrub
kicked off by periodic.  The pool is composed of 4TB disks and is about
70% occupied.  When it fills by somewhere around another 5-10% I'll swap
in 6TB drives for the 4TB ones, and swap in 8TB "primary" backup disks
for the existing 6TB ones.

This particular machine has a spinning-rust pool (which is this one) and
another composed of 240GB Intel 730 SSDs -- fairly old as SSDs go, but
much faster than spinning rust, and they have power protection, which
IMHO is utterly mandatory for SSDs in any environment where you actually
care about the data being there after a forced, unexpected plug-pull.
The machine is UPS-backed with apcupsd monitoring it, so *in theory* it
should never suffer an unsolicited power failure without notice, but
"crap happens": a few years ago there was an undetected fault in one of
the UPS batteries (the unit hadn't reported it despite being programmed
to run automated self-tests), the power glitched, and blammo -- down it
went, no warning.

My current "consider these" SSDs for similar replacement or size
upgrades would likely be the Micron units -- not the fastest out there,
but plenty fast, reasonably priced, power-protected, and available in
several versions depending on the write endurance you need.

-- 
Karl Denninger
k...@denninger.net
/The Market Ticker/
/[S/MIME encrypted email preferred]/
