Re: Sort benchmark on 2000 nodes

Enis Soztutar Wed, 05 Sep 2007 06:02:03 -0700

I am wondering how Hadoop scores on sorting 1TB with, say, 1000 nodes. Is it possible for you to try the Terasort benchmark?


Devaraj Das wrote:

This is FYI. We at Yahoo! successfully ran Hadoop (up-to-date trunk
version) on a cluster of 2000 nodes. The programs we ran were RandomWriter
and Sort. Sort performance was pretty good - we could sort 20TB of data in
2.5 hours! There were not many task failures - most of those that failed
encountered file checksum errors during merge and map output serving, and
some got killed due to lack of progress reporting. Overall, a pretty
successful run.
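
For a sense of scale, the figures quoted above (20TB, 2.5 hours, 2000 nodes) can be turned into rough throughput numbers. This is only a back-of-the-envelope sketch: it assumes 1 TB = 10**12 bytes and that the work was spread evenly across all 2000 nodes, neither of which the message states. The function name `sort_throughput` is made up for illustration.

```python
# Rough throughput implied by the figures in the quoted message.
# Assumptions (not stated in the thread): 1 TB = 10**12 bytes, and the
# load was spread evenly over all 2000 nodes.

DATA_TB = 20     # data sorted, from the message
HOURS = 2.5      # wall-clock time, from the message
NODES = 2000     # cluster size, from the message


def sort_throughput(data_tb, hours, nodes):
    """Return (aggregate GB/s, per-node MB/s) for a sort run."""
    total_bytes = data_tb * 10**12
    seconds = hours * 3600
    bytes_per_second = total_bytes / seconds
    aggregate_gb_s = bytes_per_second / 10**9
    per_node_mb_s = bytes_per_second / nodes / 10**6
    return aggregate_gb_s, per_node_mb_s


aggregate, per_node = sort_throughput(DATA_TB, HOURS, NODES)
print(f"aggregate: {aggregate:.2f} GB/s, per node: {per_node:.2f} MB/s")
```

Under those assumptions the run comes to a little over 2 GB/s sorted across the cluster, or roughly 1 MB/s of sorted output per node. That only counts the final sorted data, not the intermediate reads, writes and shuffle traffic a sort also generates.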
