Hi,

Is this the speed you are observing during the initial writes of the files (i.e., while you are initially putting the 10 GB files into HDFS with replication)?
Regards,
Raja Nagendra Kumar

Gyuribácsi wrote:
>
> Hi,
>
> I have a 10 node cluster (IBM blade servers, 48GB RAM, 2x500GB Disk, 16 HT cores).
>
> I've uploaded 10 files to HDFS. Each file is 10GB. I used the streaming jar with 'wc -l' as mapper and 'cat' as reducer.
>
> I use 64MB block size and the default replication (3).
>
> The wc on the 100 GB took about 220 seconds, which translates to about 3.5 Gbit/sec processing speed. One disk can do sequential reads at 1 Gbit/sec, so I would expect something around 20 Gbit/sec (minus some overhead), and I'm getting only 3.5.
>
> Is my expectation valid?
>
> I checked the jobtracker and it seems all nodes are working, each reading the right blocks. I have not played with the number of mappers and reducers yet. It seems the number of mappers is the same as the number of blocks, and the number of reducers is 20 (there are 20 disks). This looks OK to me.
>
> We also did an experiment with TestDFSIO with similar results. Aggregated read IO speed is around 3.5 Gbit/sec. It is just too far from my expectation :(
>
> Please help!
>
> Thank you,
> Gyorgy

--
View this message in context: http://old.nabble.com/Poor-IO-performance-on-a-10-node-cluster.-tp31732971p32076106.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
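For what it's worth, the arithmetic in the quoted message can be sanity-checked with a short script. This is just a sketch of the back-of-the-envelope numbers: the 220 s runtime, the 1 Gbit/sec per-disk sequential read figure, and the 20-disk count are all taken from the message above, not measured independently.

```python
# Sanity-check the throughput figures quoted in the message above.

data_bytes = 10 * 10 * 10**9      # 10 files x 10 GB each = 100 GB
elapsed_s = 220                   # observed job runtime from the message

# Observed aggregate read throughput in Gbit/s
observed_gbit = data_bytes * 8 / elapsed_s / 10**9
print(f"observed aggregate read: {observed_gbit:.1f} Gbit/s")  # ~3.6

# Idealized upper bound: every disk streaming sequentially at full speed
disks = 20                        # 10 nodes x 2 disks (from the message)
per_disk_gbit = 1.0               # assumed sequential read speed per disk
expected_gbit = disks * per_disk_gbit
print(f"idealized upper bound:   {expected_gbit:.1f} Gbit/s")
```

So the ~3.5 Gbit/sec reported is roughly a factor of 5-6 below the idealized bound, which is why the question of whether replication writes (or other overheads such as task startup and scheduling) are included in the measurement matters.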
