hardware is similar to that discussed here:
http://wiki.apache.org/lucene-hadoop-data/attachments/HadoopPresentations/attachments/oscon-part-2.pdf
- 10:1 oversubscribed network (so ~100 Mbit bandwidth all nodes to all nodes)
- 40 nodes / leaf switch
- Machines are beefy
- 4 SATA drives, 500 or 750 GB each, 7200 RPM
- 4+ cores (modern Intels or AMDs)
- 4+ GB RAM
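(A quick sanity check of the bandwidth figure above, assuming gigabit NICs, which the mail does not state explicitly: with a 10:1 oversubscribed core, each node's fair share of cross-rack bandwidth drops to about a tenth of its link speed.)

```python
# Back-of-the-envelope check of the "100 Mbit all nodes to all nodes" figure.
# Assumption (not stated in the mail): gigabit per-node links.
nic_mbit = 1000          # assumed per-node link speed
oversubscription = 10    # 10:1, as stated above
effective_mbit = nic_mbit / oversubscription
print(effective_mbit)    # 100.0
```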
On Sep 5, 2007, at 10:19 AM, Joydeep Sen Sarma wrote:
It will be very useful to see the hadoop/job config settings and get some sense of the underlying hardware config.
-----Original Message-----
From: Devaraj Das [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 05, 2007 2:29 AM
To: [email protected]
Subject: Sort benchmark on 2000 nodes
This is FYI. We at Yahoo! were able to successfully run Hadoop (up-to-date trunk version) on a cluster of 2000 nodes. The programs we ran were RandomWriter and Sort. Sort performance was pretty good: we could sort 20 TB of data in 2.5 hours! There were not many task failures; most of those that did fail encountered file checksum errors during merge and map-output serving, and some got killed due to lack of progress reporting. Overall, a pretty successful run.
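(The reported numbers imply the following rough aggregate throughput, a sketch assuming decimal terabytes; note that a distributed sort reads, shuffles, and writes the data, so actual per-node disk and network traffic is a multiple of this figure.)

```python
# Rough throughput implied by the run: 20 TB sorted in 2.5 hours on 2000 nodes.
data_bytes = 20 * 10**12          # 20 TB, decimal assumed
seconds = 2.5 * 3600              # 2.5 hours
nodes = 2000
aggregate_mb_s = data_bytes / seconds / 10**6   # cluster-wide MB/s
per_node_mb_s = aggregate_mb_s / nodes          # MB/s per node
print(round(aggregate_mb_s), round(per_node_mb_s, 2))  # ~2222 MB/s, ~1.11 MB/s
```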