Quick question about data node hardware. I've read a few articles, which
cover the basics, including Cloudera's recommendations here:
http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/
The article is from early 2010, but I'm assuming that
Inline.
On Thursday, July 12, 2012 at 12:56 PM, Bartosz M. Frak wrote:
Quick question about data node hardware. I've read a few articles, which
cover the basics, including Cloudera's recommendations here:
Amandeep Khurana wrote:
Inline.
On Thursday, July 12, 2012 at 12:56 PM, Bartosz M. Frak wrote:
Quick question about data node hardware. I've read a few articles, which
cover the basics, including Cloudera's recommendations here:
The issue with having fewer cores per box is that you are collocating datanode,
region servers, task trackers and then the MR tasks themselves too. Plus you
need a core for the OS too. These are things that need to run on a single node,
so you need a minimum amount of resources that can handle
Uhm... I'd take a step back...
Thanks for the reply. I didn't realize that all the non-MR tasks were this
CPU-bound; plus my naive assumption was that four spindles would have a hard
time supplying data to MR fast enough for it to become bogged down.
Your gut feel is correct.
If you go w