On May 30, 2011 5:28 AM, "Gyuribácsi" <[email protected]> wrote:
>
>
> Hi,
>
> I have a 10 node cluster (IBM blade servers, 48GB RAM, 2x500GB Disk, 16 HT
> cores).
>
> I've uploaded 10 files to HDFS. Each file is 10GB. I used the streaming jar
> with 'wc -l' as mapper and 'cat' as reducer.
>
> I use 64MB block size and the default replication (3).
>
> The wc on the 100 GB took about 220 seconds, which translates to about 3.5
> Gbit/sec processing speed. One disk can do sequential reads at 1 Gbit/sec, so
> I would expect something around 20 Gbit/sec (minus some overhead), and I'm
> getting only 3.5.
>
> Is my expectation valid?
>
> I checked the jobtracker and it seems all nodes are working, each reading
> the right blocks. I have not played with the number of mappers and reducers
> yet. It seems the number of mappers is the same as the number of blocks and
> the number of reducers is 20 (there are 20 disks). This looks OK to me.
>
> We also did an experiment with TestDFSIO with similar results. Aggregate
> read I/O speed is around 3.5 Gbit/sec. It is just too far from my
> expectation :(
>
> Please help!
>
> Thank you,
> Gyorgy
> --
> View this message in context:
> http://old.nabble.com/Poor-IO-performance-on-a-10-node-cluster.-tp31732971p31732971.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
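
The arithmetic in the quoted message can be checked with a short sketch. It is only a back-of-the-envelope calculation, not a benchmark. It assumes that "GB" means 10^9 bytes, that "Gbit" means 10^9 bits, that "64MB" means 64 * 2^20 bytes, and that each of the 20 disks sustains the 1 Gbit/sec quoted above. The function and variable names are made up for illustration.

```python
import math

# Figures taken from the quoted message.
NODES = 10
DISKS_PER_NODE = 2
FILES = 10
FILE_BYTES = 10 * 10**9          # assumption: 10 GB = 10 * 10^9 bytes
BLOCK_BYTES = 64 * 2**20         # assumption: 64 MB = 64 * 2^20 bytes
ELAPSED_S = 220
PER_DISK_GBIT_S = 1.0            # sequential read rate claimed per disk


def throughput_gbit_per_s(total_bytes, seconds):
    """Convert a byte count and a duration into Gbit/sec (10^9 bits)."""
    return total_bytes * 8 / seconds / 1e9


total_bytes = FILES * FILE_BYTES
disks = NODES * DISKS_PER_NODE

measured = throughput_gbit_per_s(total_bytes, ELAPSED_S)
expected = disks * PER_DISK_GBIT_S
per_disk_mb_s = total_bytes / ELAPSED_S / disks / 1e6

# One map task per block; the last block of each file may be partial.
blocks_per_file = math.ceil(FILE_BYTES / BLOCK_BYTES)
map_tasks = FILES * blocks_per_file

print(f"measured aggregate: {measured:.2f} Gbit/sec")
print(f"naive disk ceiling: {expected:.1f} Gbit/sec")
print(f"fraction achieved:  {measured / expected:.0%}")
print(f"per-disk rate:      {per_disk_mb_s:.1f} MB/sec")
print(f"map tasks (blocks): {map_tasks}")
```

Under these assumptions the measured rate works out to a bit over 3.6 Gbit/sec, roughly 18% of the naive 20 Gbit/sec ceiling, or about 23 MB/sec per disk. The block count also shows that the job is split into well over a thousand short map tasks, so per-task overhead may be worth looking at alongside raw disk speed.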