Re: Question about RAID controllers and hadoop

2011-08-14 Thread Charles Wimmer
[cwimmer@hostname bonnie++-1.03e]$ ./bonnie++ -d . -s 5 -m P410 -f
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
P410             5M           68392  12 21153   3          116423   4 216.8   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 22238  33     + +++     + +++     + +++     + +++     + +++


P410,5M,,,68392,12,21153,3,,,116423,4,216.8,0,16,22238,33,+,+++,+,+++,+,+++,+,+++,+,+++
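
For anyone who wants to compare runs, the comma-separated summary line above can be turned into named values with a few lines of Python. This is only a sketch: the field names are made-up labels, and the field order is an assumption taken from the column order of the table printed above (27 fields). Empty fields (the per-character tests skipped by -f) and the "+" placeholders come back as None.

```python
import csv
from io import StringIO

# Hypothetical labels; the order is assumed to follow the columns of the
# human-readable bonnie++ 1.03 table (27 fields in total).
FIELDS = [
    "machine", "size",
    "out_char_kps", "out_char_cpu",
    "out_block_kps", "out_block_cpu",
    "rewrite_kps", "rewrite_cpu",
    "in_char_kps", "in_char_cpu",
    "in_block_kps", "in_block_cpu",
    "seeks_ps", "seeks_cpu",
    "files",
    "seq_create_ps", "seq_create_cpu",
    "seq_read_ps", "seq_read_cpu",
    "seq_delete_ps", "seq_delete_cpu",
    "rand_create_ps", "rand_create_cpu",
    "rand_read_ps", "rand_read_cpu",
    "rand_delete_ps", "rand_delete_cpu",
]


def to_number(raw):
    """Convert a field to float; blanks and '+' placeholders become None."""
    try:
        return float(raw)
    except ValueError:
        return None


def parse_bonnie_csv(line):
    """Parse one bonnie++ CSV summary line into a dict keyed by FIELDS."""
    row = next(csv.reader(StringIO(line.strip())))
    if len(row) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(row)}")
    result = {}
    for name, raw in zip(FIELDS, row):
        # The first two fields are text labels, everything else is numeric.
        result[name] = raw if name in ("machine", "size") else to_number(raw)
    return result


if __name__ == "__main__":
    line = ("P410,5M,,,68392,12,21153,3,,,116423,4,216.8,0,16,22238,33,"
            "+,+++,+,+++,+,+++,+,+++,+,+++")
    run = parse_bonnie_csv(line)
    print(run["machine"], run["out_block_kps"], run["in_block_kps"])
```

Keeping the skipped and placeholder fields as None rather than 0 avoids mistaking "not measured" for a real result when several runs are averaged.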



On 8/11/11 5:15 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

On Thu, Aug 11, 2011 at 3:26 PM, Charles Wimmer cwim...@yahoo-inc.com wrote:
 We currently use P410s in 12 disk system.  Each disk is set up as a RAID0 
 volume.  Performance is at least as good as a bare disk.

Can you please share what throughput you see with P410s? Are these SATA or SAS?



 On 8/11/11 3:23 PM, GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com 
 wrote:

 If I read that email chain correctly then they were referring to the classic 
 JBOD vs multiple disks striped together conversation. The conversation that 
 was started here is referring to JBOD vs 1 RAID 0 per disk and the effects of 
 the raid controller on those independent raids.

 Matt

 -Original Message-
 From: Kai Voigt [mailto:k...@123.org]
 Sent: Thursday, August 11, 2011 5:17 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop

 Yahoo did some testing 2 years ago: 
 http://markmail.org/message/xmzc45zi25htr7ry

 But updated benchmark would be interesting to see.

 Kai

 On 12.08.2011 at 00:13, GOEKE, MATTHEW (AG/1000) wrote:

 My assumption would be that having a set of 4 raid 0 disks would actually be 
 better than having a controller that allowed pure JBOD of 4 disks due to the 
 cache on the controller. If anyone has any personal experience with this I 
 would love to know performance numbers but our infrastructure guy is doing 
 tests on exactly this over the next couple days so I will pass it along once 
 we have it.

 Matt

 -Original Message-
 From: Bharath Mundlapudi [mailto:bharathw...@yahoo.com]
 Sent: Thursday, August 11, 2011 5:00 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop

 True, you need a P410 controller. You can create RAID0 for each disk to make 
 it as JBOD.


 -Bharath



 
 From: Koert Kuipers ko...@tresata.com
 To: common-user@hadoop.apache.org
 Sent: Thursday, August 11, 2011 2:50 PM
 Subject: Question about RAID controllers and hadoop

 Hello all,
 We are considering using low end HP proliant machines (DL160s and DL180s)
 for cluster nodes. However with these machines if you want to do more than 4
 hard drives then HP puts in a P410 raid controller. We would configure the
 RAID controller to function as JBOD, by simply creating multiple RAID
 volumes with one disk. Does anyone have experience with this setup? Is it a
 good idea, or am i introducing a i/o bottleneck?
 Thanks for your help!
 Best, Koert
 This e-mail message may contain privileged and/or confidential information, 
 and is intended to be received only by persons entitled
 to receive such information. If you have received this e-mail in error, 
 please notify the sender immediately. Please delete it and
 all attachments from any servers, hard drives or any other media. Other use 
 of this e-mail by you is strictly prohibited.

 All e-mails and attachments sent and received are subject to monitoring, 
 reading and archival by Monsanto, including its
 subsidiaries. The recipient of this e-mail is solely responsible for 
 checking for the presence of Viruses or other Malware.
 Monsanto, along with its subsidiaries, accepts no liability for any damage 
 caused by any such code transmitted by or accompanying
 this e-mail or any attachment.


 The information contained in this email may be subject to the export control 
 laws and regulations of the United States, potentially
 including but not limited to the Export Administration Regulations (EAR) and 
 sanctions regulations issued by the U.S. Department of
 Treasury, Office of Foreign Asset Controls (OFAC).  As a recipient of this 
 information you are obligated to comply with all
 applicable U.S. export laws and regulations.



 --
 Kai Voigt

Question about RAID controllers and hadoop

2011-08-11 Thread Koert Kuipers
Hello all,
We are considering using low-end HP ProLiant machines (DL160s and DL180s)
for cluster nodes. However, with these machines, if you want more than 4
hard drives then HP puts in a P410 RAID controller. We would configure the
RAID controller to function as JBOD by simply creating multiple RAID
volumes with one disk each. Does anyone have experience with this setup? Is it a
good idea, or am I introducing an I/O bottleneck?
Thanks for your help!
Best, Koert
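
For illustration only, here is a rough sketch of how single-disk volumes like these might be handed to a DataNode once each one is formatted and mounted. It assumes the dfs.data.dir property from Hadoop releases of this era, which takes a comma-separated list of directories. The mount points (/data/1, /data/2, ...) and the dfs/dn subdirectory are hypothetical names, not something from this thread.

```python
def per_disk_dirs(num_disks, mount_prefix="/data", subdir="dfs/dn"):
    """Return a comma-separated list with one directory per single-disk volume.

    mount_prefix and subdir are hypothetical; substitute your own mount layout.
    """
    if num_disks < 1:
        raise ValueError("need at least one disk")
    return ",".join(
        f"{mount_prefix}/{i}/{subdir}" for i in range(1, num_disks + 1)
    )


def property_xml(name, value):
    """Render one <property> element in the Hadoop *-site.xml style."""
    return (
        "<property>\n"
        f"  <name>{name}</name>\n"
        f"  <value>{value}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    # Eight single-disk RAID0 volumes, each mounted separately (assumed layout).
    print(property_xml("dfs.data.dir", per_disk_dirs(8)))
```

Listing each volume separately lets the DataNode spread blocks across the disks itself, which is the point of treating the controller as JBOD rather than striping.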


Re: Question about RAID controllers and hadoop

2011-08-11 Thread Bharath Mundlapudi
True, you need a P410 controller. You can create a RAID0 volume for each disk to 
make it behave like JBOD.


-Bharath




From: Koert Kuipers ko...@tresata.com
To: common-user@hadoop.apache.org
Sent: Thursday, August 11, 2011 2:50 PM
Subject: Question about RAID controllers and hadoop

Hello all,
We are considering using low end HP proliant machines (DL160s and DL180s)
for cluster nodes. However with these machines if you want to do more than 4
hard drives then HP puts in a P410 raid controller. We would configure the
RAID controller to function as JBOD, by simply creating multiple RAID
volumes with one disk. Does anyone have experience with this setup? Is it a
good idea, or am i introducing a i/o bottleneck?
Thanks for your help!
Best, Koert

RE: Question about RAID controllers and hadoop

2011-08-11 Thread GOEKE, MATTHEW (AG/1000)
My assumption would be that having 4 single-disk RAID 0 volumes would actually be 
better than having a controller that allowed pure JBOD of 4 disks, due to the 
cache on the controller. If anyone has personal experience with this, I 
would love to see performance numbers. Our infrastructure guy is running 
tests on exactly this over the next couple of days, so I will pass the results 
along once we have them.

Matt

-Original Message-
From: Bharath Mundlapudi [mailto:bharathw...@yahoo.com] 
Sent: Thursday, August 11, 2011 5:00 PM
To: common-user@hadoop.apache.org
Subject: Re: Question about RAID controllers and hadoop

True, you need a P410 controller. You can create RAID0 for each disk to make it 
as JBOD.


-Bharath




From: Koert Kuipers ko...@tresata.com
To: common-user@hadoop.apache.org
Sent: Thursday, August 11, 2011 2:50 PM
Subject: Question about RAID controllers and hadoop

Hello all,
We are considering using low end HP proliant machines (DL160s and DL180s)
for cluster nodes. However with these machines if you want to do more than 4
hard drives then HP puts in a P410 raid controller. We would configure the
RAID controller to function as JBOD, by simply creating multiple RAID
volumes with one disk. Does anyone have experience with this setup? Is it a
good idea, or am i introducing a i/o bottleneck?
Thanks for your help!
Best, Koert



Re: Question about RAID controllers and hadoop

2011-08-11 Thread Kai Voigt
Yahoo did some testing 2 years ago: http://markmail.org/message/xmzc45zi25htr7ry

But an updated benchmark would be interesting to see.

Kai

On 12.08.2011 at 00:13, GOEKE, MATTHEW (AG/1000) wrote:

 My assumption would be that having a set of 4 raid 0 disks would actually be 
 better than having a controller that allowed pure JBOD of 4 disks due to the 
 cache on the controller. If anyone has any personal experience with this I 
 would love to know performance numbers but our infrastructure guy is doing 
 tests on exactly this over the next couple days so I will pass it along once 
 we have it.
 
 Matt
 
 -Original Message-
 From: Bharath Mundlapudi [mailto:bharathw...@yahoo.com] 
 Sent: Thursday, August 11, 2011 5:00 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop
 
 True, you need a P410 controller. You can create RAID0 for each disk to make 
 it as JBOD.
 
 
 -Bharath
 
 
 
 
 From: Koert Kuipers ko...@tresata.com
 To: common-user@hadoop.apache.org
 Sent: Thursday, August 11, 2011 2:50 PM
 Subject: Question about RAID controllers and hadoop
 
 Hello all,
 We are considering using low end HP proliant machines (DL160s and DL180s)
 for cluster nodes. However with these machines if you want to do more than 4
 hard drives then HP puts in a P410 raid controller. We would configure the
 RAID controller to function as JBOD, by simply creating multiple RAID
 volumes with one disk. Does anyone have experience with this setup? Is it a
 good idea, or am i introducing a i/o bottleneck?
 Thanks for your help!
 Best, Koert
 
 

-- 
Kai Voigt
k...@123.org






Re: Question about RAID controllers and hadoop

2011-08-11 Thread Charles Wimmer
We currently use P410s in 12-disk systems.  Each disk is set up as a RAID0 
volume.  Performance is at least as good as a bare disk.


On 8/11/11 3:23 PM, GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com 
wrote:

If I read that email chain correctly then they were referring to the classic 
JBOD vs multiple disks striped together conversation. The conversation that was 
started here is referring to JBOD vs 1 RAID 0 per disk and the effects of the 
raid controller on those independent raids.

Matt

-Original Message-
From: Kai Voigt [mailto:k...@123.org]
Sent: Thursday, August 11, 2011 5:17 PM
To: common-user@hadoop.apache.org
Subject: Re: Question about RAID controllers and hadoop

Yahoo did some testing 2 years ago: http://markmail.org/message/xmzc45zi25htr7ry

But updated benchmark would be interesting to see.

Kai

On 12.08.2011 at 00:13, GOEKE, MATTHEW (AG/1000) wrote:

 My assumption would be that having a set of 4 raid 0 disks would actually be 
 better than having a controller that allowed pure JBOD of 4 disks due to the 
 cache on the controller. If anyone has any personal experience with this I 
 would love to know performance numbers but our infrastructure guy is doing 
 tests on exactly this over the next couple days so I will pass it along once 
 we have it.

 Matt

 -Original Message-
 From: Bharath Mundlapudi [mailto:bharathw...@yahoo.com]
 Sent: Thursday, August 11, 2011 5:00 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop

 True, you need a P410 controller. You can create RAID0 for each disk to make 
 it as JBOD.


 -Bharath



 
 From: Koert Kuipers ko...@tresata.com
 To: common-user@hadoop.apache.org
 Sent: Thursday, August 11, 2011 2:50 PM
 Subject: Question about RAID controllers and hadoop

 Hello all,
 We are considering using low end HP proliant machines (DL160s and DL180s)
 for cluster nodes. However with these machines if you want to do more than 4
 hard drives then HP puts in a P410 raid controller. We would configure the
 RAID controller to function as JBOD, by simply creating multiple RAID
 volumes with one disk. Does anyone have experience with this setup? Is it a
 good idea, or am i introducing a i/o bottleneck?
 Thanks for your help!
 Best, Koert



--
Kai Voigt
k...@123.org







Re: Question about RAID controllers and hadoop

2011-08-11 Thread Koert Kuipers
Hey Charles, I was considering using 8 drives, each set as RAID0, so it's
good to hear such a setup is working for you.
Best Koert

On Thu, Aug 11, 2011 at 6:26 PM, Charles Wimmer cwim...@yahoo-inc.com wrote:

 We currently use P410s in 12 disk system.  Each disk is set up as a RAID0
 volume.  Performance is at least as good as a bare disk.


 On 8/11/11 3:23 PM, GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com
 wrote:

 If I read that email chain correctly then they were referring to the
 classic JBOD vs multiple disks striped together conversation. The
 conversation that was started here is referring to JBOD vs 1 RAID 0 per disk
 and the effects of the raid controller on those independent raids.

 Matt

 -Original Message-
 From: Kai Voigt [mailto:k...@123.org]
 Sent: Thursday, August 11, 2011 5:17 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop

 Yahoo did some testing 2 years ago:
 http://markmail.org/message/xmzc45zi25htr7ry

 But updated benchmark would be interesting to see.

 Kai

 On 12.08.2011 at 00:13, GOEKE, MATTHEW (AG/1000) wrote:

  My assumption would be that having a set of 4 raid 0 disks would actually
 be better than having a controller that allowed pure JBOD of 4 disks due to
 the cache on the controller. If anyone has any personal experience with this
 I would love to know performance numbers but our infrastructure guy is doing
 tests on exactly this over the next couple days so I will pass it along once
 we have it.
 
  Matt
 
  -Original Message-
  From: Bharath Mundlapudi [mailto:bharathw...@yahoo.com]
  Sent: Thursday, August 11, 2011 5:00 PM
  To: common-user@hadoop.apache.org
  Subject: Re: Question about RAID controllers and hadoop
 
  True, you need a P410 controller. You can create RAID0 for each disk to
 make it as JBOD.
 
 
  -Bharath
 
 
 
  
  From: Koert Kuipers ko...@tresata.com
  To: common-user@hadoop.apache.org
  Sent: Thursday, August 11, 2011 2:50 PM
  Subject: Question about RAID controllers and hadoop
 
  Hello all,
  We are considering using low end HP proliant machines (DL160s and DL180s)
  for cluster nodes. However with these machines if you want to do more than 4
  hard drives then HP puts in a P410 raid controller. We would configure the
  RAID controller to function as JBOD, by simply creating multiple RAID
  volumes with one disk. Does anyone have experience with this setup? Is it a
  good idea, or am i introducing a i/o bottleneck?
  Thanks for your help!
  Best, Koert
 
 

 --
 Kai Voigt
 k...@123.org








Re: Question about RAID controllers and hadoop

2011-08-11 Thread Mohit Anchlia
On Thu, Aug 11, 2011 at 3:26 PM, Charles Wimmer cwim...@yahoo-inc.com wrote:
 We currently use P410s in 12 disk system.  Each disk is set up as a RAID0 
 volume.  Performance is at least as good as a bare disk.

Can you please share what throughput you see with P410s? Are these SATA or SAS?



 On 8/11/11 3:23 PM, GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com 
 wrote:

 If I read that email chain correctly then they were referring to the classic 
 JBOD vs multiple disks striped together conversation. The conversation that 
 was started here is referring to JBOD vs 1 RAID 0 per disk and the effects of 
 the raid controller on those independent raids.

 Matt

 -Original Message-
 From: Kai Voigt [mailto:k...@123.org]
 Sent: Thursday, August 11, 2011 5:17 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop

 Yahoo did some testing 2 years ago: 
 http://markmail.org/message/xmzc45zi25htr7ry

 But updated benchmark would be interesting to see.

 Kai

 On 12.08.2011 at 00:13, GOEKE, MATTHEW (AG/1000) wrote:

 My assumption would be that having a set of 4 raid 0 disks would actually be 
 better than having a controller that allowed pure JBOD of 4 disks due to the 
 cache on the controller. If anyone has any personal experience with this I 
 would love to know performance numbers but our infrastructure guy is doing 
 tests on exactly this over the next couple days so I will pass it along once 
 we have it.

 Matt

 -Original Message-
 From: Bharath Mundlapudi [mailto:bharathw...@yahoo.com]
 Sent: Thursday, August 11, 2011 5:00 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop

 True, you need a P410 controller. You can create RAID0 for each disk to make 
 it as JBOD.


 -Bharath



 
 From: Koert Kuipers ko...@tresata.com
 To: common-user@hadoop.apache.org
 Sent: Thursday, August 11, 2011 2:50 PM
 Subject: Question about RAID controllers and hadoop

 Hello all,
 We are considering using low end HP proliant machines (DL160s and DL180s)
 for cluster nodes. However with these machines if you want to do more than 4
 hard drives then HP puts in a P410 raid controller. We would configure the
 RAID controller to function as JBOD, by simply creating multiple RAID
 volumes with one disk. Does anyone have experience with this setup? Is it a
 good idea, or am i introducing a i/o bottleneck?
 Thanks for your help!
 Best, Koert



 --
 Kai Voigt
 k...@123.org