DataNode Hardware

2012-07-12 Thread Bartosz M. Frak
Quick question about data node hardware. I've read a few articles that cover the basics, including Cloudera's recommendations here: http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/ The article is from early 2010, but I'm assuming that

Re: DataNode Hardware

2012-07-12 Thread Amandeep Khurana
Inline. On Thursday, July 12, 2012 at 12:56 PM, Bartosz M. Frak wrote: Quick question about data node hardware. I've read a few articles that cover the basics, including Cloudera's recommendations here:

Re: DataNode Hardware

2012-07-12 Thread Bartosz M. Frak
Amandeep Khurana wrote: Inline. On Thursday, July 12, 2012 at 12:56 PM, Bartosz M. Frak wrote: Quick question about data node hardware. I've read a few articles that cover the basics, including Cloudera's recommendations here:

Re: DataNode Hardware

2012-07-12 Thread Amandeep Khurana
The issue with having fewer cores per box is that you are colocating the datanode, region server, and task tracker with the MR tasks themselves. You also need a core for the OS. These all have to run on a single node, so you need a minimum amount of resources that can handle
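
The core-budget argument above can be made concrete with a rough sketch. Only the one core reserved for the OS comes from the message; the one-core-per-daemon figures and the function name `cores_left_for_tasks` are illustrative assumptions, not measured values or anyone's recommendation.

```python
# Hypothetical per-node core budget. Only the OS reservation is taken from
# the thread; the one-core-per-daemon numbers are assumptions for illustration.
RESERVED_CORES = {
    "os": 1,             # the thread mentions needing a core for the OS
    "datanode": 1,       # assumed
    "region_server": 1,  # assumed
    "task_tracker": 1,   # assumed
}


def cores_left_for_tasks(total_cores, reserved=RESERVED_CORES):
    """Return how many cores remain for MR tasks after the reservations."""
    return max(0, total_cores - sum(reserved.values()))


if __name__ == "__main__":
    for total in (4, 8, 12):
        print(total, "cores:", cores_left_for_tasks(total), "left for MR tasks")
```

Under these assumed reservations, a four-core box has nothing left over for the MR tasks themselves, which is the concern raised about low core counts.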

Re: DataNode Hardware

2012-07-12 Thread Michael Segel
Uhm... I'd take a step back... Thanks for the reply. I didn't realize that all the non-MR tasks were this CPU bound; plus, my naive assumption was that four spindles would have a hard time supplying data to MR fast enough for it to become bogged down. Your gut feel is correct. If you go w