Michael Bieniosek wrote:
When I try to scale Hadoop up to about 100 nodes on EC2 (single-cpu Xen), I
notice things start to fall apart.  For example, the jobtracker starts
dropping requests with the message "Call queue overflow discarding oldest
call".  I've also seen problems with the namenode where dfs requests fail
with EOFExceptions.

What version of Hadoop are you seeing this with? Scalability has been improving.

I've tried increasing the heartbeat value for the dfs (it's not configurable
for the jobtracker though).  Is there some other trick to make hadoop scale
a little further?  The website claims that Hadoop has scaled to 600 nodes,
but it seems like I would need a very powerful machine for the namenode and
jobtracker to do this.  Am I missing something?

Yahoo! does use dual-processor nodes that are more powerful than EC2's virtual nodes, but probably not 6x more powerful.

Doug

Reply via email to