It would be very useful to see the Hadoop and job config settings, and to get some sense of the underlying hardware config.
-----Original Message-----
From: Devaraj Das [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 05, 2007 2:29 AM
To: [email protected]
Subject: Sort benchmark on 2000 nodes

This is FYI. We at Yahoo! successfully ran Hadoop (up-to-date trunk version) on a cluster of 2000 nodes. The programs we ran were RandomWriter and Sort. Sort performance was pretty good: we sorted 20TB of data in 2.5 hours! There were not many task failures. Most of the tasks that failed encountered file checksum errors during merge and map output serving; some were killed due to lack of progress reporting. Overall, a pretty successful run.
