Re: Raid over 48 disks ... for real now

2008-01-18 Thread michael

Quoting Norman Elton [EMAIL PROTECTED]:


I posed the question a few weeks ago about how to best accommodate
software RAID over an array of 48 disks (a Sun X4500 server, a.k.a.
Thumper). I appreciate all the suggestions.

Well, the hardware is here. It is indeed six Marvell 88SX6081 SATA
controllers, each with eight 1TB drives, for a total raw storage of
48TB. I must admit, it's quite impressive. And loud. More information
about the hardware is available online...

http://www.sun.com/servers/x64/x4500/arch-wp.pdf

It came loaded with Solaris, configured with ZFS. Things seemed to
work fine. I did not do any benchmarks, but I can revert to that
configuration if necessary.

Now I've loaded RHEL onto the box. For a first-shot, I've created one
RAID-5 array (+ 1 spare) on each of the controllers, then used LVM to
create a VolGroup across the arrays.

So now I'm trying to figure out what to do with this space. So far,
I've tested mke2fs on a 1TB and a 5TB LogVol.

I wish RHEL would support XFS/ZFS, but for now, I'm stuck with ext3.
Am I better off sticking with relatively small partitions (2-5 TB), or
should I crank up the block size and go for one big partition?


Impressive system. I'm curious what the storage drives look like
and how they attach to the server with that many disks.
Sounds like you have some time to play around before shoving it into  
production.

I wonder how long it would take to run an fsck on one large filesystem?

Cheers,
Mike


Re: Raid over 48 disks ... for real now

2008-01-18 Thread Greg Cormier
 I wonder how long it would take to run an fsck on one large filesystem?

:)

I would imagine you'd have time to order a new system, build it, and
restore the backups before the fsck was done!


Re: Raid over 48 disks ... for real now

2008-01-18 Thread Norman Elton
It is quite a box. There's a picture of the box with the cover removed
on Sun's website:

http://www.sun.com/images/k3/k3_sunfirex4500_4.jpg

From the X4500 homepage, there's a gallery of additional pictures. The
drives drop in from the top. Massive fans channel air in the small
gaps between the drives. It doesn't look like there's much room
between the disks, but a lot of cold air gets sucked in the front, and
a lot of hot air comes out the back. So it must be doing its job :).

I have not tried an fsck on it yet. I'll probably set up a lot of 2TB
partitions rather than a single large partition, then write the
software to handle storing data across many partitions.

Norman

On 1/18/08, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 Quoting Norman Elton [EMAIL PROTECTED]:

  I posed the question a few weeks ago about how to best accommodate
  software RAID over an array of 48 disks (a Sun X4500 server, a.k.a.
  Thumper). I appreciate all the suggestions.
 
  Well, the hardware is here. It is indeed six Marvell 88SX6081 SATA
  controllers, each with eight 1TB drives, for a total raw storage of
  48TB. I must admit, it's quite impressive. And loud. More information
  about the hardware is available online...
 
  http://www.sun.com/servers/x64/x4500/arch-wp.pdf
 
  It came loaded with Solaris, configured with ZFS. Things seemed to
  work fine. I did not do any benchmarks, but I can revert to that
  configuration if necessary.
 
  Now I've loaded RHEL onto the box. For a first-shot, I've created one
  RAID-5 array (+ 1 spare) on each of the controllers, then used LVM to
  create a VolGroup across the arrays.
 
  So now I'm trying to figure out what to do with this space. So far,
  I've tested mke2fs on a 1TB and a 5TB LogVol.
 
  I wish RHEL would support XFS/ZFS, but for now, I'm stuck with ext3.
  Am I better off sticking with relatively small partitions (2-5 TB), or
  should I crank up the block size and go for one big partition?

 Impressive system. I'm curious to what the storage drives look like
 and how they attach to the server with that many disks?
 Sounds like you have some time to play around before shoving it into
 production.
 I wonder how long it would take to run an fsck on one large filesystem?

 Cheers,
 Mike



Re: Raid over 48 disks ... for real now

2008-01-18 Thread Jon Lewis

On Thu, 17 Jan 2008, Janek Kozicki wrote:


I wish RHEL would support XFS/ZFS, but for now, I'm stuck with ext3.


there is ext4 (or ext4dev) - it's ext3 modified to support filesystems up to
1024 PB (1048576 TB). You could check whether it's feasible. Personally I'd
always stick with ext2/ext3/ext4, since it's the most widely used and thus
has the best recovery tools.


Something else to keep in mind...XFS fs repair tools require large amounts 
of memory.  If you were to create one or a few really huge fs's on this 
array, you might end up with fs's which can't be repaired because you 
don't have or even can't get a machine with enough RAM for the job...not 
to mention the amount of time it would take.


--
 Jon Lewis   |  I route
 Senior Network Engineer |  therefore you are
 Atlantic Net|
_ http://www.lewis.org/~jlewis/pgp for PGP public key_


Raid over 48 disks ... for real now

2008-01-17 Thread Norman Elton
I posed the question a few weeks ago about how to best accommodate
software RAID over an array of 48 disks (a Sun X4500 server, a.k.a.
Thumper). I appreciate all the suggestions.

Well, the hardware is here. It is indeed six Marvell 88SX6081 SATA
controllers, each with eight 1TB drives, for a total raw storage of
48TB. I must admit, it's quite impressive. And loud. More information
about the hardware is available online...

http://www.sun.com/servers/x64/x4500/arch-wp.pdf

It came loaded with Solaris, configured with ZFS. Things seemed to
work fine. I did not do any benchmarks, but I can revert to that
configuration if necessary.

Now I've loaded RHEL onto the box. For a first-shot, I've created one
RAID-5 array (+ 1 spare) on each of the controllers, then used LVM to
create a VolGroup across the arrays.
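Roughly, and just as a sketch (hypothetical device names, assuming each
controller's eight drives show up as a contiguous run like sda..sdh),
that was along the lines of:

# mdadm --create /dev/md0 --level=5 --raid-devices=7 --spare-devices=1 /dev/sd[a-h]
# ... likewise md1 through md5 for the other five controllers ...
# pvcreate /dev/md[0-5]
# vgcreate VolGroup00 /dev/md[0-5]
# lvcreate -L 5120G -n LogVol01 VolGroup00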

So now I'm trying to figure out what to do with this space. So far,
I've tested mke2fs on a 1TB and a 5TB LogVol.

I wish RHEL would support XFS/ZFS, but for now, I'm stuck with ext3.
Am I better off sticking with relatively small partitions (2-5 TB), or
should I crank up the block size and go for one big partition?

Thoughts?

Norman Elton


Re: Raid over 48 disks ... for real now

2008-01-17 Thread Norman Elton
 Hi, sounds like a monster server. I am interested in how you will make
 the space useful to remote machines- iscsi? this is what I am
 researching currently.

Yes, it's a honker of a box. It will be collecting data from various
collector servers. The plan right now is to collect the data into
binary files using a daemon (already running on a smaller box), then
make the last 30/60/90/?? days available in a database that is
populated from these files. If we need to gather older data, then the
individual files must be consulted locally.

So, in production, I would probably set up the database partition on
its own set of 6 disks, then dedicate the rest to handling/archiving
the raw binary files. These files are small (a few MB each), as they
get rotated every five minutes.

Hope this makes sense, and provides a little background info on what
we're trying to do.

Norman


Re: Raid over 48 disks

2007-12-25 Thread pg_mh
 On Wed, 19 Dec 2007 07:28:20 +1100, Neil Brown
 [EMAIL PROTECTED] said:

[ ... what to do with 48 drive Sun Thumpers ... ]

neilb I wouldn't create a raid5 or raid6 on all 48 devices.
neilb RAID5 only survives a single device failure and with that
neilb many devices, the chance of a second failure before you
neilb recover becomes appreciable.

That's just one of the many problems; others are:

* If a drive fails, rebuild traffic is going to hit hard, with
  reading in parallel 47 blocks to compute a new 48th.

* With a parity strip length of 48 it will be that much harder
  to avoid read-modify before write, as it will be avoidable
  only for writes of at least 48 blocks aligned on 48 block
  boundaries. And reading 47 blocks to write one is going to be
  quite painful.

[ ... ]

neilb RAID10 would be a good option if you are happy with 24
neilb drives worth of space. [ ... ]

That sounds like the only feasible option (except for the 3
drive case in most cases). Parity RAID does not scale much
beyond 3-4 drives.

neilb Alternately, 8 6-drive RAID5s or 6 8-drive RAID6s, and use
neilb RAID0 to combine them together. This would give you
neilb adequate reliability and performance and still a large
neilb amount of storage space.

That sounds optimistic to me: the reason to do a RAID50 of
8x(5+1) can only be to have a single filesystem, else one could
have 8 distinct filesystems each with a subtree of the whole.
With a single filesystem the failure of any one of the 8 RAID5
components of the RAID0 will cause the loss of the whole lot.

So in the 47+1 case a loss of any two drives would lead to
complete loss; in the 8x(5+1) case only a loss of two drives in
the same RAID5 will.

It does not sound like a great improvement to me (especially
considering the thoroughly inane practice of building arrays out
of disks of the same make and model taken out of the same box).

There are also modest improvements in the RMW strip size and in
the cost of a rebuild after a single drive loss. Probably the
reduction in the RMW strip size is the best improvement.

Anyhow, let's assume 0.5TB drives; with a 47+1 we get a single
23.5TB filesystem, and with 8*(5+1) we get a 20TB filesystem.
With current filesystem technology either size is worrying, for
example as to time needed for an 'fsck'.

In practice RAID5 beyond 3-4 drives seems only useful for almost
read-only filesystems where restoring from backups is quick and
easy, never mind the 47+1 case or the 8x(5+1) one, and I think
that giving some credit even to the latter arrangement is not
quite right...


Re: Raid over 48 disks

2007-12-25 Thread Bill Davidsen

Peter Grandi wrote:

On Wed, 19 Dec 2007 07:28:20 +1100, Neil Brown
[EMAIL PROTECTED] said:



[ ... what to do with 48 drive Sun Thumpers ... ]

neilb I wouldn't create a raid5 or raid6 on all 48 devices.
neilb RAID5 only survives a single device failure and with that
neilb many devices, the chance of a second failure before you
neilb recover becomes appreciable.

That's just one of the many problems; others are:

* If a drive fails, rebuild traffic is going to hit hard, with
  reading in parallel 47 blocks to compute a new 48th.

* With a parity strip length of 48 it will be that much harder
  to avoid read-modify before write, as it will be avoidable
  only for writes of at least 48 blocks aligned on 48 block
  boundaries. And reading 47 blocks to write one is going to be
  quite painful.

[ ... ]

neilb RAID10 would be a good option if you are happy with 24
neilb drives worth of space. [ ... ]

That sounds like the only feasible option (except for the 3
drive case in most cases). Parity RAID does not scale much
beyond 3-4 drives.

neilb Alternately, 8 6-drive RAID5s or 6 8-drive RAID6s, and use
neilb RAID0 to combine them together. This would give you
neilb adequate reliability and performance and still a large
neilb amount of storage space.

That sounds optimistic to me: the reason to do a RAID50 of
8x(5+1) can only be to have a single filesystem, else one could
have 8 distinct filesystems each with a subtree of the whole.
With a single filesystem the failure of any one of the 8 RAID5
components of the RAID0 will cause the loss of the whole lot.

So in the 47+1 case a loss of any two drives would lead to
complete loss; in the 8x(5+1) case only a loss of two drives in
the same RAID5 will.

It does not sound like a great improvement to me (especially
considering the thoroughly inane practice of building arrays out
of disks of the same make and model taken out of the same box).
  


Quality control just isn't so bad that "same box" makes a big
difference, assuming that you have an appropriate number of hot spares
online. Note that I said big difference; is there some clustering of
failures? Some, but damn little. A few years ago I was working with
multiple 6TB machines and 20+ 1TB machines, all using small, fast
drives in RAID5E. I can't remember a case where a drive failed before
rebuild was complete, and only one or two where there was a failure to
degraded mode before the hot spare was replaced.


That said, RAID5E can typically rebuild a lot faster than a typical hot
spare on a dedicated drive, at least for any given impact on performance.
This undoubtedly reduced our exposure time.

There are also modest improvements in the RMW strip size and in
the cost of a rebuild after a single drive loss. Probably the
reduction in the RMW strip size is the best improvement.

Anyhow, let's assume 0.5TB drives; with a 47+1 we get a single
23.5TB filesystem, and with 8*(5+1) we get a 20TB filesystem.
With current filesystem technology either size is worrying, for
example as to time needed for an 'fsck'.
  


Given that someone is putting a typical filesystem full of small files
on a big raid, I agree. But fsck with large files is pretty fast on a
given filesystem (200GB files on a 6TB ext3, for instance), due to the
small number of inodes in play. The bitmap resolution is a factor, but
it scales pretty linearly; it's fsck with lots of files that gets really
slow. And let's face it, the objective of raid is to avoid doing that
fsck in the first place ;-)


--
Bill Davidsen [EMAIL PROTECTED]
 Woe unto the statesman who makes war without a reason that will still
 be valid when the war is over... Otto von Bismark 





Re: Raid over 48 disks

2007-12-21 Thread Leif Nixon
Norman Elton [EMAIL PROTECTED] writes:

 We're investigating the possibility of running Linux (RHEL) on top of
 Sun's X4500 Thumper box:

 http://www.sun.com/servers/x64/x4500/

I think BNL's evaluation of Solaris/ZFS vs. Linux/MD on a Thumper
might be of interest:

  
http://hepix.caspur.it/storage/hep_pdf/2007/Spring/Petkus_HEPiX_Spring06.storageeval.pdf

-- 
Leif Nixon   -Systems expert

National Supercomputer Centre-  Linkoping University



Re: Raid over 48 disks

2007-12-21 Thread Leif Nixon
Mattias Wadenstein [EMAIL PROTECTED] writes:

 There are those that have run Linux MD RAID on thumpers before. I
 vaguely recall some driver issues (unrelated to MD) that made it less
 suitable than solaris, but that might be fixed in recent kernels.

I think that was mainly an issue for people trying to squeeze
Scientific Linux 3 onto their thumpers.

-- 
Leif Nixon   -Systems expert

National Supercomputer Centre-  Linkoping University



Re: Raid over 48 disks

2007-12-20 Thread Thiemo Nagel

Bill Davidsen wrote:

              16k read           64k write
chunk
size       RAID 5  RAID 6     RAID 5  RAID 6    (MB/s)
128k         492     497        268     270
256k         615     530        288     270
512k         625     607        230     174
1024k        650     620        170      75


What is your stripe cache size?


I didn't fiddle with the default when I did these tests.

Now (with 256k chunk size) I had

# cat stripe_cache_size
256

but increasing that to 1024 didn't show a noticeable improvement for 
reading.  Still around 550MB/s.


Kind regards,

Thiemo


Re: Raid over 48 disks

2007-12-20 Thread Bill Davidsen

Thiemo Nagel wrote:

Bill Davidsen wrote:

              16k read           64k write
chunk
size       RAID 5  RAID 6     RAID 5  RAID 6    (MB/s)
128k         492     497        268     270
256k         615     530        288     270
512k         625     607        230     174
1024k        650     620        170      75


What is your stripe cache size?


I didn't fiddle with the default when I did these tests.

Now (with 256k chunk size) I had

# cat stripe_cache_size
256

but increasing that to 1024 didn't show a noticeable improvement for 
reading.  Still around 550MB/s.


You can use blockdev to raise the readahead, either on the drives or the
array. That may make a difference; I use 4-8MB on the drive, more on the
array depending on how I use it.
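For instance (device names here are just examples; --setra takes a count
of 512-byte sectors):

# blockdev --setra 8192 /dev/sdc      (4 MB readahead on one member drive)
# blockdev --setra 65536 /dev/md3     (32 MB on the array)
# blockdev --getra /dev/md3           (check the current value)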


Kind regards,

Thiemo




--
Bill Davidsen [EMAIL PROTECTED]
 Woe unto the statesman who makes war without a reason that will still
 be valid when the war is over... Otto von Bismark 





Re: Raid over 48 disks

2007-12-19 Thread Mattias Wadenstein

On Wed, 19 Dec 2007, Neil Brown wrote:


On Tuesday December 18, [EMAIL PROTECTED] wrote:

We're investigating the possibility of running Linux (RHEL) on top of
Sun's X4500 Thumper box:

http://www.sun.com/servers/x64/x4500/

Basically, it's a server with 48 SATA hard drives. No hardware RAID.
It's designed for Sun's ZFS filesystem.

So... we're curious how Linux will handle such a beast. Has anyone run
MD software RAID over so many disks? Then piled LVM/ext3 on top of
that? Any suggestions?


There are those that have run Linux MD RAID on thumpers before. I vaguely 
recall some driver issues (unrelated to MD) that made it less suitable 
than solaris, but that might be fixed in recent kernels.



Alternately, 8 6-drive RAID5s or 6 8-drive RAID6s, and use RAID0 to
combine them together.  This would give you adequate reliability and
performance and still a large amount of storage space.


My personal suggestion would be 5 9-disk raid6s, one raid1 root mirror and 
one hot spare. Then raid0, lvm, or separate filesystem on those 5 raidsets 
for data, depending on your needs.


You get almost as much data space as with the 6 8-disk raid6s, and have a 
separate pair of disks for all the small updates (logging, metadata, etc), 
so this makes a lot of sense if most of the data is bulk file access.
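A rough sketch of that layout (hypothetical device names; in practice
you'd spread each 9-disk set across the six controllers):

# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb    (root mirror)
# mdadm --create /dev/md1 --level=6 --raid-devices=9 /dev/sd[c-k]
# ... four more 9-disk raid6 sets (md2..md5), keeping one disk back as a hot spare ...
# mdadm --create /dev/md6 --level=0 --raid-devices=5 /dev/md[1-5]         (or LVM instead)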


/Mattias Wadenstein


Re: Raid over 48 disks

2007-12-19 Thread Russell Smith

Guy Watkins wrote:

} -Original Message-
} From: [EMAIL PROTECTED] [mailto:linux-raid-
} [EMAIL PROTECTED] On Behalf Of Brendan Conoboy
} Sent: Tuesday, December 18, 2007 3:36 PM
} To: Norman Elton
} Cc: linux-raid@vger.kernel.org
} Subject: Re: Raid over 48 disks
} 
} Norman Elton wrote:

}  We're investigating the possibility of running Linux (RHEL) on top of
}  Sun's X4500 Thumper box:
} 
}  http://www.sun.com/servers/x64/x4500/
} 
} Neat- 6 8 port SATA controllers!  It'll be worth checking to be sure

} each controller has equal bandwidth.  If some controllers are on slower
} buses than others you may want to consider that and balance the md
} device layout.

Assuming the 6 controllers are equal, I would make 3 16 disk RAID6 arrays
using 2 disks from each controller.  That way any 1 controller can fail and
your system will still be running.  6 disks will be used for redundancy.

Or 6 8-disk RAID6 arrays using 1 disk from each controller.  That way any 2
controllers can fail and your system will still be running.  12 disks will
be used for redundancy.  Might be too excessive!

Combine them into a RAID0 array.

Guy

Sounds interesting!

Just out of interest, what's stopping you from using Solaris?

Though, I'm curious how md will compare to ZFS performance wise. There 
is some interesting configuration info / advice for Solaris here: 
http://www.solarisinternals.com/wiki/index.php/ZFS_Configuration_Guide 
esp for the X4500.



Russell


Re: Raid over 48 disks

2007-12-19 Thread Bill Davidsen

Thiemo Nagel wrote:

Performance of the raw device is fair:
# dd if=/dev/md2 of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 15.6071 seconds, 550 MB/s

Somewhat less through ext3 (created with -E stride=64):
# dd if=largetestfile of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 26.4103 seconds, 325 MB/s


Quite slow?

10 disks (raptors) raid 5 on regular sata controllers:

# dd if=/dev/md3 of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 10.718 seconds, 801 MB/s

# dd if=bigfile of=/dev/zero bs=128k count=64k
3640379392 bytes (3.6 GB) copied, 6.58454 seconds, 553 MB/s


Interesting.  Any ideas what could be the reason?  How much do you get 
from a single drive?  -- The Samsung HD501LJ that I'm using gives 
~84MB/s when reading from the beginning of the disk.


With RAID 5 I'm getting slightly better results (though I really
wonder why, since naively I would expect identical read performance)
but that only accounts for a small part of the difference:

              16k read           64k write
chunk
size       RAID 5  RAID 6     RAID 5  RAID 6    (MB/s)
128k         492     497        268     270
256k         615     530        288     270
512k         625     607        230     174
1024k        650     620        170      75


What is your stripe cache size?

--
Bill Davidsen [EMAIL PROTECTED]
 Woe unto the statesman who makes war without a reason that will still
 be valid when the war is over... Otto von Bismark 





Re: Raid over 48 disks

2007-12-19 Thread Justin Piszcz



On Wed, 19 Dec 2007, Bill Davidsen wrote:


Thiemo Nagel wrote:

Performance of the raw device is fair:
# dd if=/dev/md2 of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 15.6071 seconds, 550 MB/s

Somewhat less through ext3 (created with -E stride=64):
# dd if=largetestfile of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 26.4103 seconds, 325 MB/s


Quite slow?

10 disks (raptors) raid 5 on regular sata controllers:

# dd if=/dev/md3 of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 10.718 seconds, 801 MB/s

# dd if=bigfile of=/dev/zero bs=128k count=64k
3640379392 bytes (3.6 GB) copied, 6.58454 seconds, 553 MB/s


Interesting.  Any ideas what could be the reason?  How much do you get from 
a single drive?  -- The Samsung HD501LJ that I'm using gives ~84MB/s when 
reading from the beginning of the disk.


With RAID 5 I'm getting slightly better results (though I really
wonder why, since naively I would expect identical read performance)
but that only accounts for a small part of the difference:

              16k read           64k write
chunk
size       RAID 5  RAID 6     RAID 5  RAID 6    (MB/s)
128k         492     497        268     270
256k         615     530        288     270
512k         625     607        230     174
1024k        650     620        170      75



What is your stripe cache size?


# Set stripe_cache_size for RAID5.
echo Setting stripe_cache_size to 16 MiB for /dev/md3
echo 16384 > /sys/block/md3/md/stripe_cache_size

Justin.


Re: Raid over 48 disks

2007-12-19 Thread Bill Davidsen

Mattias Wadenstein wrote:

On Wed, 19 Dec 2007, Neil Brown wrote:


On Tuesday December 18, [EMAIL PROTECTED] wrote:

We're investigating the possibility of running Linux (RHEL) on top of
Sun's X4500 Thumper box:

http://www.sun.com/servers/x64/x4500/

Basically, it's a server with 48 SATA hard drives. No hardware RAID.
It's designed for Sun's ZFS filesystem.

So... we're curious how Linux will handle such a beast. Has anyone run
MD software RAID over so many disks? Then piled LVM/ext3 on top of
that? Any suggestions?


There are those that have run Linux MD RAID on thumpers before. I 
vaguely recall some driver issues (unrelated to MD) that made it less 
suitable than solaris, but that might be fixed in recent kernels.



Alternately, 8 6-drive RAID5s or 6 8-drive RAID6s, and use RAID0 to
combine them together.  This would give you adequate reliability and
performance and still a large amount of storage space.


My personal suggestion would be 5 9-disk raid6s, one raid1 root mirror 
and one hot spare. Then raid0, lvm, or separate filesystem on those 5 
raidsets for data, depending on your needs.


Other than thinking raid-10 better than raid-1 for performance, I like it.


You get almost as much data space as with the 6 8-disk raid6s, and 
have a separate pair of disks for all the small updates (logging, 
metadata, etc), so this makes a lot of sense if most of the data is
bulk file access.


--
Bill Davidsen [EMAIL PROTECTED]
 Woe unto the statesman who makes war without a reason that will still
 be valid when the war is over... Otto von Bismark 





Raid over 48 disks

2007-12-18 Thread Norman Elton
We're investigating the possibility of running Linux (RHEL) on top of  
Sun's X4500 Thumper box:


http://www.sun.com/servers/x64/x4500/

Basically, it's a server with 48 SATA hard drives. No hardware RAID.  
It's designed for Sun's ZFS filesystem.


So... we're curious how Linux will handle such a beast. Has anyone run  
MD software RAID over so many disks? Then piled LVM/ext3 on top of  
that? Any suggestions?


Are we crazy to think this is even possible?

Thanks!

Norman Elton


Re: Raid over 48 disks

2007-12-18 Thread Justin Piszcz



On Tue, 18 Dec 2007, Norman Elton wrote:

We're investigating the possibility of running Linux (RHEL) on top of Sun's 
X4500 Thumper box:


http://www.sun.com/servers/x64/x4500/

Basically, it's a server with 48 SATA hard drives. No hardware RAID. It's 
designed for Sun's ZFS filesystem.


So... we're curious how Linux will handle such a beast. Has anyone run MD 
software RAID over so many disks? Then piled LVM/ext3 on top of that? Any 
suggestions?


Are we crazy to think this is even possible?

Thanks!

Norman Elton


It sounds VERY fun and exciting if you ask me!  The most disks I've used
when testing SW RAID was 10 with various raid settings.  With that many
drives you'd want RAID6 or RAID10 for sure, in case more than one failed at
the same time, and definitely XFS/JFS/EXT4(?), as EXT3 is capped to 8TB.


I'd be curious what kind of aggregate bandwidth you can get off of it with 
that many drives.


Justin.


Re: Raid over 48 disks

2007-12-18 Thread Robin Hill
On Tue Dec 18, 2007 at 12:29:27PM -0500, Norman Elton wrote:

 We're investigating the possibility of running Linux (RHEL) on top of Sun's 
 X4500 Thumper box:

 http://www.sun.com/servers/x64/x4500/

 Basically, it's a server with 48 SATA hard drives. No hardware RAID. It's 
 designed for Sun's ZFS filesystem.

 So... we're curious how Linux will handle such a beast. Has anyone run MD 
 software RAID over so many disks? Then piled LVM/ext3 on top of that? Any 
 suggestions?

 Are we crazy to think this is even possible?

The most I've done is 28 drives in RAID-10 (SCSI drives, with the array
formatted as XFS).  That keeps failing one drive, but I've not had time
to give the drive a full test yet to confirm it's a drive issue.  It's
been running quite happily (under pretty heavy database load) on 27
disks for a couple of months now though.

Cheers,
Robin
-- 
 ___
( ' } |   Robin Hill[EMAIL PROTECTED] |
   / / )  | Little Jim says |
  // !!   |  He fallen in de water !! |




Re: Raid over 48 disks

2007-12-18 Thread Thiemo Nagel

Dear Norman,

So... we're curious how Linux will handle such a beast. Has anyone run 
MD software RAID over so many disks? Then piled LVM/ext3 on top of 
that? Any suggestions?


Are we crazy to think this is even possible?


I'm running 22x 500GB disks attached to RocketRaid2340 and NFORCE-MCP55
onboard controllers on an Athlon DC 5000+ with 1GB RAM:

9746150400 blocks super 1.2 level 6, 256k chunk, algorithm 2 [22/22]

Performance of the raw device is fair:
# dd if=/dev/md2 of=/dev/zero bs=128k count=64k
65536+0 records in
65536+0 records out
8589934592 bytes (8.6 GB) copied, 15.6071 seconds, 550 MB/s

Somewhat less through ext3 (created with -E stride=64):
# dd if=largetestfile of=/dev/zero bs=128k count=64k
65536+0 records in
65536+0 records out
8589934592 bytes (8.6 GB) copied, 26.4103 seconds, 325 MB/s

There were no problems up to now.  (mkfs.ext3 wants -F to create a 
filesystem larger than 8TB.  The hard maximum is 16TB, so you will need 
to create partitions, if your drives are larger than 350GB...)


Kind regards,

Thiemo Nagel




Re: Raid over 48 disks

2007-12-18 Thread Norman Elton

Thiemo --

I'm not familiar with RocketRaid. Is it handling the RAID for you, or  
are you using MD?


Thanks, all, for your feedback! I'm still surprised nobody has tried  
this on one of these Sun boxes yet. I've signed up for some demo  
hardware. I'll post what I find.


Norman


On Dec 18, 2007, at 2:34 PM, Thiemo Nagel wrote:


Dear Norman,

So... we're curious how Linux will handle such a beast. Has anyone  
run MD software RAID over so many disks? Then piled LVM/ext3 on  
top of that? Any suggestions?


Are we crazy to think this is even possible?


I'm running 22x 500GB disks attached to RocketRaid2340 and NFORCE-MCP55
onboard controllers on an Athlon DC 5000+ with 1GB RAM:

9746150400 blocks super 1.2 level 6, 256k chunk, algorithm 2 [22/22]

Performance of the raw device is fair:
# dd if=/dev/md2 of=/dev/zero bs=128k count=64k
65536+0 records in
65536+0 records out
8589934592 bytes (8.6 GB) copied, 15.6071 seconds, 550 MB/s

Somewhat less through ext3 (created with -E stride=64):
# dd if=largetestfile of=/dev/zero bs=128k count=64k
65536+0 records in
65536+0 records out
8589934592 bytes (8.6 GB) copied, 26.4103 seconds, 325 MB/s

There were no problems up to now.  (mkfs.ext3 wants -F to create a  
filesystem larger than 8TB.  The hard maximum is 16TB, so you will  
need to create partitions, if your drives are larger than 350GB...)


Kind regards,

Thiemo Nagel






Re: Raid over 48 disks

2007-12-18 Thread Thiemo Nagel

Dear Norman,

I'm not familiar with RocketRaid. Is it handling the RAID for you, or 
are you using MD?


I'm using md.  The controller is in a mode that exports all drives 
individually.


Kind regards,

Thiemo


Re: Raid over 48 disks

2007-12-18 Thread Justin Piszcz



On Tue, 18 Dec 2007, Thiemo Nagel wrote:


Dear Norman,

So... we're curious how Linux will handle such a beast. Has anyone run MD 
software RAID over so many disks? Then piled LVM/ext3 on top of that? Any 
suggestions?


Are we crazy to think this is even possible?


I'm running 22x 500GB disks attached to RocketRaid2340 and NFORCE-MCP55
onboard controllers on an Athlon DC 5000+ with 1GB RAM:

9746150400 blocks super 1.2 level 6, 256k chunk, algorithm 2 [22/22]

Performance of the raw device is fair:
# dd if=/dev/md2 of=/dev/zero bs=128k count=64k
65536+0 records in
65536+0 records out
8589934592 bytes (8.6 GB) copied, 15.6071 seconds, 550 MB/s

Somewhat less through ext3 (created with -E stride=64):
# dd if=largetestfile of=/dev/zero bs=128k count=64k
65536+0 records in
65536+0 records out
8589934592 bytes (8.6 GB) copied, 26.4103 seconds, 325 MB/s

There were no problems up to now.  (mkfs.ext3 wants -F to create a filesystem 
larger than 8TB.  The hard maximum is 16TB, so you will need to create 
partitions, if your drives are larger than 350GB...)


Kind regards,

Thiemo Nagel





Quite slow?

10 disks (raptors) raid 5 on regular sata controllers:

# dd if=/dev/md3 of=/dev/zero bs=128k count=64k
65536+0 records in
65536+0 records out
8589934592 bytes (8.6 GB) copied, 10.718 seconds, 801 MB/s

# dd if=bigfile of=/dev/zero bs=128k count=64k
27773+1 records in
27773+1 records out
3640379392 bytes (3.6 GB) copied, 6.58454 seconds, 553 MB/s




Re: Raid over 48 disks

2007-12-18 Thread Brendan Conoboy

Norman Elton wrote:
We're investigating the possibility of running Linux (RHEL) on top of 
Sun's X4500 Thumper box:


http://www.sun.com/servers/x64/x4500/


Neat - 6 8-port SATA controllers!  It'll be worth checking to be sure
each controller has equal bandwidth.  If some controllers are on slower 
buses than others you may want to consider that and balance the md 
device layout.


So... we're curious how Linux will handle such a beast. Has anyone run 
MD software RAID over so many disks? Then piled LVM/ext3 on top of that? 
Any suggestions?


There used to be a maximum number of devices allowed in a single md 
device.  Not sure if that is still the case.


With this many drives you would be well advised to make smaller raid
devices, then combine them into a larger md device (or via lvm, etc).
Consider a write with a 48-device raid5: the system may need to read
blocks from all those drives before a single write!


If it were my system and all ports were equally well connected, I'd create
3 16-drive RAID5's with 1 hot spare, then combine them via raid 0 or
lvm.  That's just my usage scenario, though (modest reliability,
excellent read speed, modest write speed).


If you put ext3 on top, remember to use the stride option when making
the filesystem.
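For example (just a sketch; stride is the md chunk size divided by the
filesystem block size, so a 256k chunk with 4k ext3 blocks gives 64):

# mke2fs -j -E stride=64 /dev/md0     (device name hypothetical)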



Are we crazy to think this is even possible?


Crazy, possible, and fun!

--
Brendan Conoboy / Red Hat, Inc. / [EMAIL PROTECTED]


Re: Raid over 48 disks

2007-12-18 Thread Neil Brown
On Tuesday December 18, [EMAIL PROTECTED] wrote:
 We're investigating the possibility of running Linux (RHEL) on top of  
 Sun's X4500 Thumper box:
 
 http://www.sun.com/servers/x64/x4500/
 
 Basically, it's a server with 48 SATA hard drives. No hardware RAID.  
 It's designed for Sun's ZFS filesystem.
 
 So... we're curious how Linux will handle such a beast. Has anyone run  
 MD software RAID over so many disks? Then piled LVM/ext3 on top of  
 that? Any suggestions?
 
 Are we crazy to think this is even possible?

Certainly possible.
The default metadata is limited to 28 devices, but with
--metadata=1

you can easily use all 48 drives or more in the one array.  I'm not
sure if you would want to though.

If you just wanted an enormous scratch space and were happy to lose
all your data on a drive failure, then you could make a raid0 across
all the drives which should work perfectly and give you lots of
space.  But that probably isn't what you want.
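(Purely as a sketch, with hypothetical device names, that scratch-space
raid0 would be something like

# mdadm --create /dev/md0 --metadata=1 --level=0 --raid-devices=48 /dev/sd[a-z] /dev/sda[a-v]

using --metadata=1 to get past the 28-device limit mentioned above.)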

I wouldn't create a raid5 or raid6 on all 48 devices.
RAID5 only survives a single device failure and with that many
devices, the chance of a second failure before you recover becomes
appreciable.

RAID6 would be much more reliable, but probably much slower.  RAID6
always needs to read or write every block in a stripe (i.e. it always
uses reconstruct-write to generate the P and Q blocks; it never does
a read-modify-write like raid5 does).  This means that every write
touches every device, so you have less possibility for parallelism
among your many drives.
It might be instructive to try it out though.

RAID10 would be a good option if you are happy with 24 drives' worth of
space.  I would probably choose a largish chunk size (256K) and use
the 'offset' layout.
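(Again just a sketch with hypothetical device names: --layout=o2 selects
the offset layout with two copies, and --chunk is in KiB:

# mdadm --create /dev/md0 --level=10 --layout=o2 --chunk=256 --raid-devices=48 /dev/sd[a-z] /dev/sda[a-v]
)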

Alternately, 8 6-drive RAID5s or 6 8-drive RAID6s, and use RAID0 to
combine them together.  This would give you adequate reliability and
performance and still a large amount of storage space.

Have fun!!!

NeilBrown



Re: Raid over 48 disks

2007-12-18 Thread Thiemo Nagel

Performance of the raw device is fair:
# dd if=/dev/md2 of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 15.6071 seconds, 550 MB/s

Somewhat less through ext3 (created with -E stride=64):
# dd if=largetestfile of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 26.4103 seconds, 325 MB/s


Quite slow?

10 disks (raptors) raid 5 on regular sata controllers:

# dd if=/dev/md3 of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 10.718 seconds, 801 MB/s

# dd if=bigfile of=/dev/zero bs=128k count=64k
3640379392 bytes (3.6 GB) copied, 6.58454 seconds, 553 MB/s


Interesting.  Any ideas what could be the reason?  How much do you get 
from a single drive?  -- The Samsung HD501LJ that I'm using gives 
~84MB/s when reading from the beginning of the disk.


With RAID 5 I'm getting slightly better results (though I really wonder
why, since naively I would expect identical read performance) but that
only accounts for a small part of the difference:

              16k read           64k write
chunk
size       RAID 5  RAID 6     RAID 5  RAID 6    (MB/s)
128k         492     497        268     270
256k         615     530        288     270
512k         625     607        230     174
1024k        650     620        170      75

Kind regards,

Thiemo


Re: Raid over 48 disks

2007-12-18 Thread Jon Nelson
On 12/18/07, Thiemo Nagel [EMAIL PROTECTED] wrote:
  Performance of the raw device is fair:
  # dd if=/dev/md2 of=/dev/zero bs=128k count=64k
  8589934592 bytes (8.6 GB) copied, 15.6071 seconds, 550 MB/s
 
  Somewhat less through ext3 (created with -E stride=64):
  # dd if=largetestfile of=/dev/zero bs=128k count=64k
  8589934592 bytes (8.6 GB) copied, 26.4103 seconds, 325 MB/s
 
  Quite slow?
 
  10 disks (raptors) raid 5 on regular sata controllers:
 
  # dd if=/dev/md3 of=/dev/zero bs=128k count=64k
  8589934592 bytes (8.6 GB) copied, 10.718 seconds, 801 MB/s
 
  # dd if=bigfile of=/dev/zero bs=128k count=64k
  3640379392 bytes (3.6 GB) copied, 6.58454 seconds, 553 MB/s

 Interesting.  Any ideas what could be the reason?  How much do you get
 from a single drive?  -- The Samsung HD501LJ that I'm using gives
 ~84MB/s when reading from the beginning of the disk.

 With RAID 5 I'm getting slightly better results (though I really wonder
 why, since naively I would expect identical read performance) but that
 only accounts for a small part of the difference:

               16k read           64k write
 chunk
 size       RAID 5  RAID 6     RAID 5  RAID 6    (MB/s)
 128k         492     497        268     270
 256k         615     530        288     270
 512k         625     607        230     174
 1024k        650     620        170      75

It strikes me that these numbers are meaningless without knowing if
that is actual data-to-disk or data-to-memcache-and-some-to-disk-too.
Later versions of 'dd' offer 'conv=fdatasync' which is really handy
(call fdatasync on the output file, syncing JUST the one file, right
before close). Otherwise, oflag=direct will (try to) bypass the
page/block cache.

I can get really impressive numbers, too (over 200MB/s on a single
disk capable of 70MB/s) when I (mis)use dd without fdatasync, et al.

The variation in reported performance can be really huge without
understanding that you aren't actually testing the DISK I/O but *some*
disk I/O and *some* memory caching.
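For what it's worth, a sketch of the two variants (file name and sizes
are arbitrary):

$ dd if=/dev/zero of=testfile bs=1M count=8192 conv=fdatasync
$ dd if=/dev/zero of=testfile bs=1M count=8192 oflag=direct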




-- 
Jon


Re: Raid over 48 disks

2007-12-18 Thread Justin Piszcz



On Tue, 18 Dec 2007, Thiemo Nagel wrote:


Performance of the raw device is fair:
# dd if=/dev/md2 of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 15.6071 seconds, 550 MB/s

Somewhat less through ext3 (created with -E stride=64):
# dd if=largetestfile of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 26.4103 seconds, 325 MB/s


Quite slow?

10 disks (raptors) raid 5 on regular sata controllers:

# dd if=/dev/md3 of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 10.718 seconds, 801 MB/s

# dd if=bigfile of=/dev/zero bs=128k count=64k
3640379392 bytes (3.6 GB) copied, 6.58454 seconds, 553 MB/s


Interesting.  Any ideas what could be the reason?  How much do you get from a 
single drive?  -- The Samsung HD501LJ that I'm using gives ~84MB/s when 
reading from the beginning of the disk.


With RAID 5 I'm getting slightly better results (though I really wonder why,
since naively I would expect identical read performance) but that only
accounts for a small part of the difference:

              16k read           64k write
chunk
size       RAID 5  RAID 6     RAID 5  RAID 6    (MB/s)
128k         492     497        268     270
256k         615     530        288     270
512k         625     607        230     174
1024k        650     620        170      75

Kind regards,

Thiemo



# dd if=/dev/sdc of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 13.8108 seconds, 77.7 MB/s

With more than 2x the drives I'd think you'd have faster speed; perhaps
the controller is the problem?


I am using ICH8R (but the raid within linux) and 2-port SATA cards; each
has its own dedicated bandwidth via the PCI-e bus.


I have also tried sw RAID5 with 10 disks (on 3ware controllers exporting
as JBOD etc); I saw similar performance with read but not write.


Justin.


Re: Raid over 48 disks

2007-12-18 Thread Thiemo Nagel

              16k read           64k write
chunk
size       RAID 5  RAID 6     RAID 5  RAID 6    (MB/s)
128k         492     497        268     270
256k         615     530        288     270
512k         625     607        230     174
1024k        650     620        170      75


It strikes me that these numbers are meaningless without knowing if
that is actual data-to-disk or data-to-memcache-and-some-to-disk-too.
Later versions of 'dd' offer 'conv=fdatasync' which is really handy
(call fdatasync on the output file, syncing JUST the one file, right
before close). Otherwise, oflag=direct will (try to) bypass the
page/block cache.

I can get really impressive numbers, too (over 200MB/s on a single
disk capable of 70MB/s) when I (mis)use dd without fdatasync, et al.

The variation in reported performance can be really huge without
understanding that you aren't actually testing the DISK I/O but *some*
disk I/O and *some* memory caching.


I did these benchmarks with 32GB of data on a machine with 1GB of RAM, 
therefore the memory cache contribution should be small.


Kind regards,

Thiemo


Re: Raid over 48 disks

2007-12-18 Thread Justin Piszcz



On Tue, 18 Dec 2007, Jon Nelson wrote:


On 12/18/07, Thiemo Nagel [EMAIL PROTECTED] wrote:

Performance of the raw device is fair:
# dd if=/dev/md2 of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 15.6071 seconds, 550 MB/s

Somewhat less through ext3 (created with -E stride=64):
# dd if=largetestfile of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 26.4103 seconds, 325 MB/s


Quite slow?

10 disks (raptors) raid 5 on regular sata controllers:

# dd if=/dev/md3 of=/dev/zero bs=128k count=64k
8589934592 bytes (8.6 GB) copied, 10.718 seconds, 801 MB/s

# dd if=bigfile of=/dev/zero bs=128k count=64k
3640379392 bytes (3.6 GB) copied, 6.58454 seconds, 553 MB/s


Interesting.  Any ideas what could be the reason?  How much do you get
from a single drive?  -- The Samsung HD501LJ that I'm using gives
~84MB/s when reading from the beginning of the disk.

With RAID 5 I'm getting slightly better results (though I really wonder
why, since naively I would expect identical read performance) but that
only accounts for a small part of the difference:

              16k read           64k write
chunk
size       RAID 5  RAID 6     RAID 5  RAID 6    (MB/s)
128k         492     497        268     270
256k         615     530        288     270
512k         625     607        230     174
1024k        650     620        170      75


It strikes me that these numbers are meaningless without knowing if
that is actual data-to-disk or data-to-memcache-and-some-to-disk-too.
Later versions of 'dd' offer 'conv=fdatasync' which is really handy
(call fdatasync on the output file, syncing JUST the one file, right
before close). Otherwise, oflag=direct will (try to) bypass the
page/block cache.

I can get really impressive numbers, too (over 200MB/s on a single
disk capable of 70MB/s) when I (mis)use dd without fdatasync, et al.

The variation in reported performance can be really huge without
understanding that you aren't actually testing the DISK I/O but *some*
disk I/O and *some* memory caching.


Ok-- How's this for caching, a DD over the entire RAID device:

$ /usr/bin/time dd if=/dev/zero of=file bs=1M
dd: writing `file': No space left on device
1070704+0 records in
1070703+0 records out
1122713473024 bytes (1.1 TB) copied, 2565.89 seconds, 438 MB/s



RE: Raid over 48 disks

2007-12-18 Thread Guy Watkins
} -Original Message-
} From: [EMAIL PROTECTED] [mailto:linux-raid-
} [EMAIL PROTECTED] On Behalf Of Brendan Conoboy
} Sent: Tuesday, December 18, 2007 3:36 PM
} To: Norman Elton
} Cc: linux-raid@vger.kernel.org
} Subject: Re: Raid over 48 disks
} 
} Norman Elton wrote:
}  We're investigating the possibility of running Linux (RHEL) on top of
}  Sun's X4500 Thumper box:
} 
}  http://www.sun.com/servers/x64/x4500/
} 
} Neat- 6 8 port SATA controllers!  It'll be worth checking to be sure
} each controller has equal bandwidth.  If some controllers are on slower
} buses than others you may want to consider that and balance the md
} device layout.

Assuming the 6 controllers are equal, I would make 3 16 disk RAID6 arrays
using 2 disks from each controller.  That way any 1 controller can fail and
your system will still be running.  6 disks will be used for redundancy.

Or 6 8-disk RAID6 arrays using 1 disk from each controller.  That way any 2
controllers can fail and your system will still be running.  12 disks will
be used for redundancy.  Might be too excessive!

Combine them into a RAID0 array.
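A rough sketch of the first option (hypothetical device names; in
practice you'd pick two drives from each controller for every 16-disk
set):

# mdadm --create /dev/md0 --level=6 --raid-devices=16 /dev/sd[a-p]
# mdadm --create /dev/md1 --level=6 --raid-devices=16 /dev/sd[q-z] /dev/sda[a-f]
# mdadm --create /dev/md2 --level=6 --raid-devices=16 /dev/sda[g-v]
# mdadm --create /dev/md3 --level=0 --raid-devices=3 /dev/md[0-2]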

Guy



RE: Raid over 48 disks

2007-12-18 Thread Justin Piszcz



On Tue, 18 Dec 2007, Guy Watkins wrote:


} -Original Message-
} From: [EMAIL PROTECTED] [mailto:linux-raid-
} [EMAIL PROTECTED] On Behalf Of Brendan Conoboy
} Sent: Tuesday, December 18, 2007 3:36 PM
} To: Norman Elton
} Cc: linux-raid@vger.kernel.org
} Subject: Re: Raid over 48 disks
}
} Norman Elton wrote:
}  We're investigating the possibility of running Linux (RHEL) on top of
}  Sun's X4500 Thumper box:
} 
}  http://www.sun.com/servers/x64/x4500/
}
} Neat- 6 8 port SATA controllers!  It'll be worth checking to be sure
} each controller has equal bandwidth.  If some controllers are on slower
} buses than others you may want to consider that and balance the md
} device layout.

Assuming the 6 controllers are equal, I would make 3 16 disk RAID6 arrays
using 2 disks from each controller.  That way any 1 controller can fail and
your system will still be running.  6 disks will be used for redundancy.

Or 6 8-disk RAID6 arrays using 1 disk from each controller.  That way any 2
controllers can fail and your system will still be running.  12 disks will
be used for redundancy.  Might be too excessive!

Combine them into a RAID0 array.

Guy




I'd be curious what the maximum aggregate bandwidth would be with RAID 0 
of 48 disks on that controller..



RE: Raid over 48 disks

2007-12-18 Thread Justin Piszcz



On Tue, 18 Dec 2007, Justin Piszcz wrote:




On Tue, 18 Dec 2007, Guy Watkins wrote:


} -Original Message-
} From: [EMAIL PROTECTED] [mailto:linux-raid-
} [EMAIL PROTECTED] On Behalf Of Brendan Conoboy
} Sent: Tuesday, December 18, 2007 3:36 PM
} To: Norman Elton
} Cc: linux-raid@vger.kernel.org
} Subject: Re: Raid over 48 disks
}
} Norman Elton wrote:
}  We're investigating the possibility of running Linux (RHEL) on top of
}  Sun's X4500 Thumper box:
} 
}  http://www.sun.com/servers/x64/x4500/
}
} Neat- 6 8 port SATA controllers!  It'll be worth checking to be sure
} each controller has equal bandwidth.  If some controllers are on slower
} buses than others you may want to consider that and balance the md
} device layout.

Assuming the 6 controllers are equal, I would make 3 16 disk RAID6 arrays
using 2 disks from each controller.  That way any 1 controller can fail and
your system will still be running.  6 disks will be used for redundancy.

Or 6 8-disk RAID6 arrays using 1 disk from each controller.  That way any 2

controllers can fail and your system will still be running.  12 disks will
be used for redundancy.  Might be too excessive!

Combine them into a RAID0 array.

Guy




I'd be curious what the maximum aggregate bandwidth would be with RAID 0 of 
48 disks on that controller..




A RAID 0 over all of the controllers rather, if possible..

