Re: raid recomendation

2012-12-08 Thread Stan Hoeppner
On 12/7/2012 5:48 PM, Aaron Toponce wrote:

 A RAID-1 will outperform
 a parity-based RAID using the same disks every time, due to calculating the
 parity. 

This hasn't been true for almost a decade.  Even the slowest of modern
single core x86 CPUs have plenty of excess horsepower for soft RAID
parity, as do ASICs on hardware RAID solutions.  There are two big
problems today with parity arrays.

The first is the IO load and latency of read-modify-write cycles which
occur when partial stripes are updated.  Most write IO is small and
random, typically comprising over 95% of writes for typical workloads
such as mail, file, LDAP, and SQL servers.  Streaming
applications such as video are an exception as they write full stripes.
Thus for every random write, you must first, at a minimum, read two
disks (data chunk/strip and parity chunk/strip) and write back to both
with the new data chunk and parity chunk.  This is with an optimized
algorithm.  The md/RAID driver has some optimizations to cut down on RMW
penalties.  Many hardware RAID solutions read then write the entire
stripe for scrubbing purposes (i.e. write all disks frequently so media
errors are caught sooner rather than later).  This is a data integrity
feature of higher end controllers.  This implementation is much slower
due to all the extra IO and head movement, but more reliable.
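
As a rough illustration of the optimized read-modify-write path described
above, here is a minimal Python sketch for a single-parity (RAID5-style)
stripe.  The chunk contents, the function names, and the IO counter are
made up for the example; this is not how md/RAID or any controller is
actually implemented.

```python
# Sketch of a RAID5-style partial-stripe update (read-modify-write).
# Chunk values and the IO counter are illustrative assumptions only.

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))

def rmw_update(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Return (new_parity, io) for rewriting one data chunk.

    IO cost: read old data and old parity, then write new data and
    new parity, i.e. 2 reads and 2 writes for one small write.
    """
    delta = xor_bytes(old_data, new_data)      # what changed in the chunk
    new_parity = xor_bytes(old_parity, delta)  # fold the change into parity
    return new_parity, {"reads": 2, "writes": 2}

# Three data chunks plus parity, as on a 4-drive RAID5 stripe.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0f\xf0"
parity = xor_bytes(xor_bytes(d0, d1), d2)

new_d1 = b"\xaa\x55"
new_parity, io = rmw_update(d1, parity, new_d1)

# The shortcut must agree with parity recomputed from the whole stripe.
full_parity = xor_bytes(xor_bytes(d0, new_d1), d2)
print(new_parity == full_parity, io)
```

The point of the sketch is the IO count: one logical write turns into four
physical IOs on two drives, where a mirror would need only two writes.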

The second is that failed drive rebuilds take FOREVER as all disks are
being read in parallel and parity calculated for every stripe, just to
rebuild one disk.  Even a RAID6 array with a small number of 2TB drives
can take 12-24 hours to rebuild.  The recommended max array drive counts
for RAID5 and RAID6 are 4 and 8 drives respectively.  One of the reasons
for this best current practice (BCP) is rebuild time.  With RAID10,
rebuild time is a constant, as you're
simply copying all the sectors from one drive to another.  A 60x2TB
drive RAID10 rebuild will take about 5 hours with low normal workload IO
hitting the array.
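
A back-of-the-envelope check of the mirror rebuild figure, as a small
Python sketch.  The 110 MB/s sustained copy rate is an assumption picked
for illustration, not a number from this thread, and the function name is
hypothetical.

```python
def mirror_rebuild_hours(drive_tb: float, mb_per_s: float) -> float:
    """Hours to copy one whole drive to its replacement at a steady rate.

    The array's drive count is not a parameter: a mirror rebuild only
    copies one drive to another, which is why the time is constant.
    """
    drive_bytes = drive_tb * 1e12             # decimal TB, as drives are sold
    seconds = drive_bytes / (mb_per_s * 1e6)  # assumed sustained throughput
    return seconds / 3600

hours = mirror_rebuild_hours(2, 110)
print(f"{hours:.1f} h")
```

At that assumed rate a 2TB drive copies in roughly five hours, whether the
array has 4 drives or 60.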

 Further, striping across two mirrors will give increased
 performance that parity-based RAID cannot achieve. 

A parity array actually has superior read speed vs a RAID10 array of the
same total spindle count because there are more data spindles.  An 8
drive RAID6 has 6 data spindles, whereas an 8 drive RAID10 only has 4.
Write performance, however, as I mentioned, is an order of magnitude
slower due to RMW.
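
The spindle arithmetic can be written down as a short Python sketch.  It
assumes single parity for RAID5, double parity for RAID6, and two-way
mirrors for RAID10; the function name is hypothetical.

```python
def data_spindles(level: str, drives: int) -> int:
    """Number of drives' worth of data in an array (rest is redundancy)."""
    if level == "raid5":
        return drives - 1          # one drive's worth of parity
    if level == "raid6":
        return drives - 2          # two drives' worth of parity
    if level == "raid10":
        if drives % 2:
            raise ValueError("two-way RAID10 needs an even drive count")
        return drives // 2         # half the drives are mirror copies
    raise ValueError(f"unknown level: {level}")

for level in ("raid6", "raid10"):
    print(level, data_spindles(level, 8))
```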

 Lastly, you can suffer
 any sort of disk failures, provided all mirrors in the stripe remain
 intact.

You mean any number, not any sort.  Yes, with RAID10 you can lose half
the drives in the array as long as no two are in the same mirror pair.
I wouldn't bank on this though.  Many drive dropouts are not due to
problems with the drives, but with the backplanes and cabling.  When
that happens, if you've not staggered your mirrors across HBAs, cables,
and cabinets (which isn't possible with RAID HBAs), you may very well
lose two drives in the same mirror.
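
The survival rule for RAID10 is easy to state in code.  This is a minimal
Python sketch assuming two-way mirrors; the device names and the pairing
are hypothetical.

```python
def raid10_survives(mirror_pairs, failed) -> bool:
    """True if no mirror pair has lost both of its drives."""
    failed = set(failed)
    return all(not (a in failed and b in failed) for a, b in mirror_pairs)

# Hypothetical 8-drive RAID10: four two-way mirrors striped together.
pairs = [("sda", "sdb"), ("sdc", "sdd"), ("sde", "sdf"), ("sdg", "sdh")]

# Half the drives gone, one per mirror: the array is still up.
print(raid10_survives(pairs, {"sda", "sdc", "sde", "sdg"}))

# Only two drives gone, but both in the same mirror: the array is lost.
print(raid10_survives(pairs, {"sda", "sdb"}))
```

If both halves of a pair hang off the same cable or backplane, a single
hardware fault produces exactly the second case.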

 1: http://zfsonlinux.org

 Just my $.02.

And that sums up the value of your ZFS on Linux recommendation quite
well.  Being a fanboy is fine.  Run it yourself.  But please don't
recommend unsupported, out-of-tree software that is difficult for the
average Debian user to install as a general purpose storage solution.

Good hardware RAID is relatively cheap, Linux has md/RAID which isn't
horrible for most workloads, and there are plenty of high quality Linux
filesystems to meet most needs, with EXT4 for casual stuff, JFS and XFS
for heavy duty, though XFS is a much better choice for many reasons; the
big one being that it's actively developed, whereas JFS is mostly in
maintenance only mode.

-- 
Stan


-- 
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org 
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/50c30205.2050...@hardwarefreak.com



RE : Re: raid recomendation

2012-12-08 Thread Nicolas Froidure



Sent from a Samsung mobile

Stan Hoeppner s...@hardwarefreak.com wrote:
On 12/7/2012 5:48 PM, Aaron Toponce wrote:




Re: raid recomendation

2012-12-07 Thread Aaron Toponce
On Thu, Dec 06, 2012 at 01:18:38PM -0300, Roberto Scattini wrote:
 hi, i have a new dell r720 server with 5 600gb disks.
 its function will be a postgresql server (the size of the databases is
 really small; with 600gb we should be fine for a long time).
 
 which raid configuration would you recommend?
 i was thinking of raid 5 with all five disks but i am not an expert.
 
 i prefer redundancy over size (i mean, i can sacrifice space). and i
 don't want performance degradation from doing raid with an incorrect number
 of disks.

I'll be the first one in this thread to recommend ZFS [1]. With 5 disks, I
would personally do a RAID-1+0, with a hot spare. A RAID-1 will outperform
a parity-based RAID using the same disks every time, due to calculating the
parity. Further, striping across two mirrors will give increased
performance that parity-based RAID cannot achieve. Lastly, you can suffer
any sort of disk failures, provided all mirrors in the stripe remain
intact.

1: http://zfsonlinux.org

If you absolutely must do a parity-based RAID, then I would suggest a
5-disk RAIDZ-1 without a hot spare. It's best practice to use a
power-of-two number of data disks, plus parity. In this case, it will give
you the best performance, decent space, and allow for 1 disk failure.
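
The "power of two, plus parity" rule of thumb can be checked with a few
lines of Python.  This is only a sketch of the rule as stated here; the
function name is hypothetical and nothing in it talks to ZFS.

```python
def follows_power_of_two_rule(total_disks: int, parity_disks: int) -> bool:
    """True if the data disks (total minus parity) are a power of two."""
    data = total_disks - parity_disks
    # A positive integer is a power of two when it has exactly one bit set.
    return data > 0 and (data & (data - 1)) == 0

# 5 disks with single parity leaves 4 data disks.
print(follows_power_of_two_rule(5, 1))
# 5 disks with double parity leaves 3 data disks.
print(follows_power_of_two_rule(5, 2))
```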

Further, I would recommend the investment in two Intel 300-series SSDs. You
can then partition the SSDs giving 1 GB on each to a mirrored ZIL, and the
rest to a striped L2ARC. For a PostgreSQL DB, you will see immense
performance gains that you cannot achieve with Linux-based software RAID
and filesystems. And, because ZFS is also a volume manager, there is no
need for LVM and the cache troubles it's plagued with [2].

2: http://serverfault.com/questions/279571/lvm-dangers-and-caveats

If interested, I've been blogging on this very topic. You can see the
relevant posts for your setup here:

* Installing ZFS on Debian: http://pthree.org/?p=2357
* The ZIL: http://pthree.org/?p=2592
* The ZFS ARC: http://pthree.org/?p=2659

Just my $.02.

-- 
. o .   o . o   . . o   o . .   . o .
. . o   . o o   o . o   . o o   . . o
o o o   . o .   . o o   o o .   o o o




raid recomendation

2012-12-06 Thread Roberto Scattini
hi, i have a new dell r720 server with 5 600gb disks.
its function will be a postgresql server (the size of the databases is
really small; with 600gb we should be fine for a long time).

which raid configuration would you recommend?
i was thinking of raid 5 with all five disks but i am not an expert.

i prefer redundancy over size (i mean, i can sacrifice space). and i
don't want performance degradation from doing raid with an incorrect number
of disks.

thanks in advance!

-- 
Roberto Scattini


Re: raid recomendation

2012-12-06 Thread Pedro Eugênio Rocha
On Thu, Dec 6, 2012 at 2:18 PM, Roberto Scattini roberto.scatt...@gmail.com
 wrote:

 hi, i have a new dell r720 server with 5 600gb disks.
 its function will be a postgresql server (the size of the databases is
 really small; with 600gb we should be fine for a long time).

 which raid configuration would you recommend?
 i was thinking of raid 5 with all five disks but i am not an expert.


Hi Roberto,

A RAID 5 volume including the 5 drives should work fine for you. A question
you should consider when making this decision is: how fast can you
replace a failed drive? In the meantime you'd be
vulnerable, since an additional failure would lead to data loss. If you
can't exchange the drive fast enough, maybe you should use a spare drive
or think about RAID 6 (I've never used it though).



 i prefer redundancy over size (i mean, i can sacrifice space). and i
 don't want performance degradation from doing raid with an incorrect number
 of disks.


Another thing you could consider is creating a separate volume for the OS.
Personally, I'd go for a 2-disk RAID 1 for the OS and a 3-disk RAID 5
for the database, since capacity isn't a problem. This configuration
ensures that the database I/O traffic does not compete with the OS (when
swapping and stuff). But that also depends on your hardware configuration.
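
To put numbers on the two layouts, here is a small Python sketch of the
usable capacity with five 600 GB disks.  It ignores filesystem and
metadata overhead, and the function name is hypothetical.

```python
DISK_GB = 600  # disk size from the original question

def usable_gb(level: str, disks: int, disk_gb: int = DISK_GB) -> int:
    """Usable capacity of one array, ignoring filesystem overhead."""
    if level == "raid1":
        return disk_gb                 # every disk holds the same data
    if level == "raid5":
        return (disks - 1) * disk_gb   # one disk's worth goes to parity
    raise ValueError(f"unknown level: {level}")

single_raid5 = usable_gb("raid5", 5)   # all five disks in one RAID 5
os_volume = usable_gb("raid1", 2)      # 2-disk RAID 1 for the OS
db_volume = usable_gb("raid5", 3)      # 3-disk RAID 5 for the database

print(single_raid5, os_volume, db_volume)
```

The split layout gives up some space compared with a single 5-disk RAID 5,
which fits the stated preference for redundancy over size.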


Best,



 thanks in advance!

 --
 Roberto Scattini


-- 
Pedro Eugênio Rocha


Re: raid recomendation

2012-12-06 Thread François TOURDE
On the 15680th day after Epoch,
Roberto Scattini wrote:

 hi, i have a new dell r720 server with 5 600gb disks.
 its function will be a postgresql server (the size of the databases is
 really small; with 600gb we should be fine for a long time).

 which raid configuration would you recommend?
 i was thinking of raid 5 with all five disks but i am not an expert.

 i prefer redundancy over size (i mean, i can sacrifice space). and i
 don't want performance degradation from doing raid with an incorrect number
 of disks.

In this case, avoid RAID5 and choose RAID1 or RAID10. Use 4 disks in the
RAID10 array and the 5th as a spare, for example.


--
Archive: http://lists.debian.org/8738zjrur7@tourde.org



Re: raid recomendation

2012-12-06 Thread Stefan
Hello,
for databases it's recommended to use raid10, because of the speed


regards 
Stefan




Re: raid recomendation

2012-12-06 Thread Federico Alberto Sayd

On 06/12/12 13:18, Roberto Scattini wrote:

hi, i have a new dell r720 server with 5 600gb disks.
its function will be a postgresql server (the size of the databases is
really small; with 600gb we should be fine for a long time).


which raid configuration would you recommend?
i was thinking of raid 5 with all five disks but i am not an expert.

i prefer redundancy over size (i mean, i can sacrifice space). and
i don't want performance degradation from doing raid with an incorrect
number of disks.


thanks in advance!

--
Roberto Scattini

If you want performance and redundancy at the cost of reduced storage
capacity, consider RAID 10.


Regards

Federico


--
Archive: http://lists.debian.org/50c0d39a.2030...@uncu.edu.ar



Re: raid recomendation

2012-12-06 Thread Martin Steigerwald
On Thursday, 6 December 2012, Roberto Scattini wrote:
 hi, i have a new dell r720 server with 5 600gb disks.
 its function will be a postgresql server (the size of the databases is
 really small; with 600gb we should be fine for a long time).
 
 which raid configuration would you recommend?
 i was thinking of raid 5 with all five disks but i am not an expert.
 
 i prefer redundancy over size (i mean, i can sacrifice space). and i
 don't want performance degradation from doing raid with an incorrect number
 of disks.

In addition to what the other posters have written:

http://baarf.com

:)

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


-- 
Archive: http://lists.debian.org/201212062022.04928.mar...@lichtvoll.de