Re: Scaling hadoop up

Michael Bieniosek Thu, 29 Mar 2007 13:02:33 -0800

I've seen this with 0.12.1.

Currently I'm running the jobtracker and namenode on one machine, with
tasktrackers and datanodes on all the others (no secondarynamenode).  It seems
like it might help to put the jobtracker and namenode on different machines;
is there anything else I could try?


-Michael

On 3/29/07 1:37 PM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:

> Michael Bieniosek wrote:
>> When I try to scale Hadoop up to about 100 nodes on EC2 (single-cpu Xen), I
>> notice things start to fall apart.  For example, the jobtracker starts
>> dropping requests with the message "Call queue overflow discarding oldest
>> call".  I've also seen problems with the namenode where dfs requests fail
>> with EOFExceptions.
> 
> What version of Hadoop are you seeing this with?  Scalability has been
> improving.
> 
>> I've tried increasing the heartbeat value for the dfs (it's not configurable
>> for the jobtracker though).  Is there some other trick to make hadoop scale
>> a little further?  The website claims that Hadoop has scaled to 600 nodes,
>> but it seems like I would need a very powerful machine for the namenode and
>> jobtracker to do this.  Am I missing something?
> 
> Yahoo! does use dual-processor nodes that are more powerful than EC2's
> virtual nodes, but probably not 6x more powerful.
> 
> Doug
