Hi,

Is this the speed you are observing during the initial writes of the files
(i.e., while you are initially putting the 10 GB files into HDFS with replication)?

Regards,
Raja Nagendra Kumar


Gyuribácsi wrote:
> 
>  
> Hi,
> 
> I have a 10 node cluster (IBM blade servers, 48GB RAM, 2x500GB Disk, 16 HT
> cores).
> 
> I've uploaded 10 files to HDFS. Each file is 10 GB. I used the streaming
> jar with 'wc -l' as the mapper and 'cat' as the reducer.
> 
> I use 64MB block size and the default replication (3).
> 
> The wc on the 100 GB took about 220 seconds, which translates to about 3.6
> Gbit/sec of processing speed. One disk can do sequential reads at about
> 1 Gbit/sec, so with 20 disks I would expect something around 20 Gbit/sec
> (minus some overhead), but I'm getting only 3.6.
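The expectation above can be sanity-checked with a quick back-of-the-envelope calculation (figures taken from the post; the ~1 Gbit/sec per-disk sequential read rate is the poster's own assumption):

```shell
# Observed: 100 GB read in about 220 seconds.
observed_gbit=$(awk 'BEGIN { printf "%.1f", 100 * 8 / 220 }')   # ~3.6 Gbit/s
# Expected upper bound: 10 nodes x 2 disks x ~1 Gbit/s sequential read each.
expected_gbit=$(awk 'BEGIN { print 10 * 2 * 1 }')               # 20 Gbit/s
echo "observed ~${observed_gbit} Gbit/s vs expected ~${expected_gbit} Gbit/s"
```

So the cluster is delivering roughly one fifth to one sixth of the raw aggregate disk bandwidth, which is the gap being asked about.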
> 
> Is my expectation valid?
> 
> I checked the jobtracker and it seems all nodes are working, each reading
> the right blocks. I have not played with the number of mappers and reducers
> yet. It seems the number of mappers is the same as the number of blocks and
> the number of reducers is 20 (there are 20 disks). This looks OK to me.
> 
> We also did an experiment with TestDFSIO, with similar results: aggregate
> read IO speed is around 3.5 Gbit/sec. It is just too far from my
> expectation :(
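For reference, a TestDFSIO read benchmark of this shape might be launched as follows (a sketch only — the test jar name varies by Hadoop version, and the file count/size here just mirror the 10 x 10 GB layout from the post; -fileSize is in MB in the versions I know of):

```
hadoop jar $HADOOP_HOME/hadoop-test-*.jar TestDFSIO -read \
  -nrFiles 10 -fileSize 10240
```

TestDFSIO writes a per-run summary (including aggregate throughput) to its results log, which is presumably where the ~3.5 Gbit/sec figure came from.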
> 
> Please help!
> 
> Thank you,
> Gyorgy
> 

-- 
View this message in context: 
http://old.nabble.com/Poor-IO-performance-on-a-10-node-cluster.-tp31732971p32076106.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
