This is an FYI. We at Yahoo! successfully ran Hadoop (an up-to-date trunk version) on a cluster of 2000 nodes. The programs we ran were RandomWriter and Sort. Sort performance was pretty good: we sorted 20TB of data in 2.5 hours! There were not many task failures. Most of the tasks that failed encountered file checksum errors during merge and map output serving; some were killed for not reporting progress. Overall, a pretty successful run.
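For a rough sense of scale, the figures above can be turned into throughput numbers. This is only a back-of-the-envelope sketch: it assumes decimal terabytes (10^12 bytes) and that all 2000 nodes took part in the sort, neither of which is stated in the message. The function name `sort_throughput` is made up for illustration.

```python
def sort_throughput(total_bytes, seconds, nodes):
    """Return (aggregate, per-node) throughput in bytes per second."""
    aggregate = total_bytes / seconds
    return aggregate, aggregate / nodes


TB = 10**12  # assumption: decimal terabytes

# 20TB sorted in 2.5 hours on 2000 nodes (figures from the message above)
aggregate, per_node = sort_throughput(20 * TB, 2.5 * 3600, 2000)

print(f"aggregate: {aggregate / 1e9:.2f} GB/s")  # aggregate: 2.22 GB/s
print(f"per node:  {per_node / 1e6:.2f} MB/s")   # per node:  1.11 MB/s
```

Under those assumptions each node handles about 10GB of the sorted data set. These numbers describe end-to-end sorted output only; they say nothing about the disk or network traffic the sort generates internally.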
- Sort benchmark on 2000 nodes Devaraj Das
- Re: Sort benchmark on 2000 nodes Enis Soztutar
- RE: Sort benchmark on 2000 nodes Devaraj Das
- RE: Sort benchmark on 2000 nodes Joydeep Sen Sarma
- Re: Sort benchmark on 2000 nodes Eric Baldeschwieler
- RE: Sort benchmark on 2000 nodes Devaraj Das
