On Sunday, May 3, 2020 6:27 PM, Jack <[email protected]> wrote:

> Minor point - you have one duplicate line there ". f  f ." which is the
> second and last line of the second group.  No effect on anything else in
> the discussion.

thanks.

> Trying to help thinking about odd numbers of disks, if you are still
> allowing only one disk to fail, then you can think about mirroring half
> disks, so each disk has half of it mirrored to a different disk, instead
> of drives always being mirrored in pairs.

that definitely helped get me unstuck and continue
thinking.  thanks.

curious.  how do people in the storage industry
look at --layout=n2?  e.g. do they ignore the
optimistic case where a 2 disk failure can be
recovered, and only assume that it protects
against 1 disk failure?

i see why gambling is not worth it here, but at
the same time, i see no reason to ignore reality
(that a 2 disk failure can be saved).

e.g. a 4-disk RAID10 with --layout=n2 gives

        1*4/10 + 2*4/10 = 1.2

expected recoverable disk failures.  details are
below:

  F   .       .   .       < recoverable
  .   F       .   .       < cases with
  .   .       F   .       < 1 disk
  .   .       .   F       < failure

  F   .       .   F       < recoverable
  .   F       F   .       < cases with
  .   F       .   F       < 2 disk
  F   .       F   .       < failures

  F   F       .   .       < not recoverable
  .   .       F   F       < cases with 2 disk
                          < failures
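
the count above can be re-checked mechanically.
a minimal python sketch, assuming the 4 disks form
two mirrored pairs, (0,1) and (2,3), and that every
1-disk and 2-disk case is equally likely (PAIRS
and survives are names made up for this example):

```python
from itertools import combinations

# assumption: with 4 disks, n2 mirrors disk 0 with 1 and disk 2 with 3
PAIRS = [(0, 1), (2, 3)]
DISKS = range(4)

def survives(failed):
    """the array survives if no mirrored pair has lost both members."""
    return not any(a in failed and b in failed for a, b in PAIRS)

# every way for 1 or 2 disks to fail, each treated as equally likely
cases = [set(c) for k in (1, 2) for c in combinations(DISKS, k)]
recoverable = [c for c in cases if survives(c)]

# average number of failed disks over all cases, counting only the
# cases that were recovered from
expected = sum(len(c) for c in recoverable) / len(cases)
print(len(cases), len(recoverable), expected)
```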

now, if we do a 5-disk --layout=n2, we get:

    1    (1)    2    (2)    3
   (3)    4    (4)    5    (5)
    6    (6)    7    (7)    8
   (8)    9    (9)    10   (10)
    11   (11)   12   (12)   13
   (13) ...

obviously, there are 5 possible ways a single disk
may fail, and all 5 of them are recoverable.

there are nchoosek(5,2) = 10 possible ways a 2
disk failure could happen, out of which 5
will be recovered:

   xxx   (1)   xxx   (2)    3
   xxx    4    xxx    5    (5)

   xxx   (1)    2    xxx    3
   xxx    4    (4)   xxx   (5)


    1    xxx    2    xxx    3
   (3)   xxx   (4)   xxx   (5)

    1    xxx    2    (2)   xxx
   (3)   xxx   (4)    5    xxx


    1    (1)   xxx   (2)   xxx
   (3)    4    xxx    5    xxx

so, expected recoverable disk failures for a
5-disk RAID10 --layout=n2 is:

        1*5/15 + 2*5/15 = 1
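
the same count for any number of disks, as a rough
python sketch.  it assumes the near-2 placement
drawn above (the two copies of a chunk sit on
consecutive disks, wrapping to the next row), and
all function names are made up for illustration:

```python
from itertools import combinations

def chunk_disks(n, chunk):
    """disks holding the two copies of a chunk (chunks counted from 0),
    assuming copies are laid out consecutively, row by row, over n disks."""
    return {(2 * chunk) % n, (2 * chunk + 1) % n}

def survives(n, failed):
    """true if every chunk still has a copy on a working disk.
    the placement pattern repeats within n chunks, so checking n is enough."""
    return all(not chunk_disks(n, c) <= failed for c in range(n))

def expected_recoverable(n, max_failures=2):
    """the average used above: every 1..max_failures case weighted equally."""
    cases = [set(c) for k in range(1, max_failures + 1)
             for c in combinations(range(n), k)]
    good = [c for c in cases if survives(n, c)]
    return sum(len(c) for c in good) / len(cases)

two_disk = [set(c) for c in combinations(range(5), 2)]
ok = [c for c in two_disk if survives(5, c)]
print(len(ok), "of", len(two_disk), "two-disk failures survive")
print(expected_recoverable(5))
```

calling expected_recoverable(4) runs the 4-disk
case through the same code.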

so, by transforming a 4-disk RAID10 into a 5-disk
one, we increase total storage capacity by half a
disk's worth of storage, while the expected number
of recoverable disk failures drops by 0.2 (from
1.2 to 1).

but if we extended the 4-disk RAID10 into a
6-disk --layout=n2, we will have:

             6                  nchoosek(6,2) - 3
= 1 * -----------------  +  2 * -----------------
      6 + nchoosek(6,2)         6 + nchoosek(6,2)

= 6/21                   +  2 * 12/21

= 1.4286 expected recoverable failing disks.

and given a 2 disk failure, there is a 12/15 = 80%
chance of surviving it.
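
the 80% figure can be double-checked in closed
form.  a small python sketch, assuming that with an
even number of disks n2 pairs the disks up as
(0,1), (2,3), (4,5), so the only fatal 2-disk
failures are the n/2 mirrored pairs
(two_disk_survival is a made-up helper name):

```python
from fractions import Fraction
from math import comb

def two_disk_survival(n):
    """chance of surviving a 2-disk failure, for even n, assuming
    the only fatal cases are the n/2 mirrored pairs."""
    if n % 2:
        raise ValueError("this shortcut only holds for an even disk count")
    fatal = n // 2
    return Fraction(comb(n, 2) - fatal, comb(n, 2))

print(two_disk_survival(6))   # 4/5
print(two_disk_survival(4))   # 2/3
```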

so, i wonder, is it a bad decision to go with an
odd number of disks in a RAID10?  what is the
right way to think about this question?

i guess the ultimate answer needs knowledge of
these:

    * F1: probability of having 1 disk fail within
          the repair window.
    * F2: probability of having 2 disks fail within
          the repair window.
    * F3: probability of having 3 disks fail within
          the repair window.
      .
      .
      .
    * Fn: probability of having n disks fail within
          the repair window.

    * R1: probability of surviving a 1 disk failure.
          equals 1 with all related cases.
    * R2: probability of surviving a 2 disk failure.
          equals 1/3 with 5-disk RAID10
          equals 0.8 with a 6-disk RAID10.
    * R3: probability of surviving a 3 disk failure.
          equals 0 with all related cases.
      .
      .
      .
    * Rn: probability of surviving an n disk failure.
          equals 0 with all related cases.

    * L : expected cost of losing data on an array.
    * D : price of a disk.

this way, the absolute expected cost when adopting
a 6-disk RAID10 is:

= 6D + F1*(1-R1)*L + F2*(1-R2)*L + F3*(1-R3)*L + ...
= 6D + F1*(1-1)*L + F2*(1-0.8)*L + F3*(1-0)*L + ...
= 6D + 0          + F2*(0.2)*L   + F3*(1-0)*L + ...

and the absolute cost for a 5-disk RAID10 is:

= 5D + F1*(1-1)*L + F2*(1-0.3333)*L + F3*(1-0)*L + ...
= 5D + 0          + F2*(0.6667)*L   + F3*(1-0)*L + ...

canceling the identical terms, what is left to compare is:

6-disk ===> 6D + 0.2*F2*L
5-disk ===> 5D + 0.6667*F2*L
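
the cost formula can be written as a small
function to see the cancellation happen.  a sketch
only: the R values are copied from the list above
as-is, D is the disk price from [1], and the F
values and L are hypothetical placeholders chosen
just to exercise the formula:

```python
def expected_cost(n_disks, disk_price, fail_prob, survive_prob, loss):
    """n*D plus the expected loss, summed over the number of failed disks k.
    fail_prob[k] plays the role of Fk and survive_prob[k] the role of Rk."""
    risk = sum(fail_prob[k] * (1 - survive_prob[k]) * loss for k in fail_prob)
    return n_disks * disk_price + risk

# hypothetical inputs, not measured values
F = {1: 0.10, 2: 0.02, 3: 0.001}
D, L = 35.85, 1000.0

R5 = {1: 1.0, 2: 0.3333, 3: 0.0}   # survival values as listed for 5 disks
R6 = {1: 1.0, 2: 0.8, 3: 0.0}      # survival values as listed for 6 disks

gap = expected_cost(6, D, F, R6, L) - expected_cost(5, D, F, R5, L)
# only one disk price and the F2 term are left in the difference
print(gap, D + (0.2 - 0.6667) * F[2] * L)
```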

from here [1] we know that a 1TB disk costs
$35.85, so:

6-disk ===> 6*35.85 + 0.2*F2*L
5-disk ===> 5*35.85 + 0.6667*F2*L

now, at which point is a 5-disk array a better
economic decision than a 6-disk one?  for
simplicity, let LOL = F2*L:

5*35.85 + 0.6667 * LOL  <   6*35.85 + 0.2 * LOL
0.6667*LOL - 0.2 * LOL  <   6*35.85 - 5*35.85
LOL * (0.6667 - 0.2)    <   6*35.85 - 5*35.85

                            6*35.85 - 5*35.85
           LOL          <   -----------------
                              0.6667 - 0.2

           LOL          <   76.816
           F2*L         <   76.816

so, a 5-disk RAID10 is better than a 6-disk RAID10
only if:

        F2*L  <  76.816 bucks.

this site [2] says that 76% of Seagate disks fail
per year (:D).  and since disks mostly fail
independently of each other, the probability of
having 2 disks fail in a year is:

        F2_year = 0.76*0.76
                = 0.5776

but what is F2_week?  each year has 52.1429 weeks.
let's be generous and assume that disk failures
are uniformly distributed across the year (e.g.
suppose that we bought them randomly, and not in a
single batch).

in this case, the probability of 2 disks failing
in the same week (suppose that our repair window
is 1 week) is:

                          52
    F2 = 0.5776 * --------------------
                 52 + nchoosek(52, 2)

       = 0.5776 * 0.037736
       = 0.021796

let's substitute a bit:

        F2 * L  <  76.816  bucks.
  0.021796 * L  <  76.816  bucks.
             L  <  76.816 / 0.021796  bucks.
             L  <  3524.3  bucks.
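
the whole chain of arithmetic can be replayed in a
few lines of python.  this is only a sketch that
reuses the inputs exactly as quoted above (disk
price, the 0.6667 and 0.2 loss shares, the 0.76
yearly figure, and the same-week factor); none of
them is re-derived here, and the variable names are
made up:

```python
from math import comb

# all inputs below are copied from the text as-is, not re-derived
DISK_PRICE = 35.85          # price of one 1TB disk, from [1]
LOSS_SHARE_5 = 0.6667       # 1 - R2 used for the 5-disk array
LOSS_SHARE_6 = 0.2          # 1 - R2 used for the 6-disk array
F2_YEAR = 0.76 * 0.76       # chance of 2 disks failing in a year
WEEKS = 52

# break-even value of F2*L: one extra disk against the extra expected loss
lol_limit = DISK_PRICE / (LOSS_SHARE_5 - LOSS_SHARE_6)

# same-week factor used in the text: 52 / (52 + nchoosek(52, 2))
same_week = WEEKS / (WEEKS + comb(WEEKS, 2))
f2_week = F2_YEAR * same_week

# largest data value L for which the 5-disk array is still cheaper
l_limit = lol_limit / f2_week
print(round(lol_limit, 3), round(f2_week, 6), round(l_limit, 1))
```

changing WEEKS or F2_YEAR shows how sensitive the
final figure is to those two guesses.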

so, in summary:

 /------------------------------------------------\
 | a 5-disk RAID10 is better than a 6-disk RAID10 |
 | ONLY IF your data is WORTH LESS than 3,524.3   |
 | bucks.                                         |
 \------------------------------------------------/

any thoughts?  i'm a newbie.  i wonder how
industry people think?

happy quarantine,
cm

------------
[1] https://www.amazon.com/WD-Blue-1TB-Hard-Drive/dp/B0088PUEPK/
[2] https://www.seagate.com/em/en/support/kb/hard-disk-drive-reliability-and-mtbf-afr-174791en/

