Hello,

I have been using Hadoop on a cluster with AMD Opteron 2212 processors clocked at 2 GHz and also on a cluster with Atom N330 processors clocked at 1.6 GHz. Both are dual-core. I always use HDFS for storing input and output data, and I observe high CPU consumption caused by HDFS in both clusters.

I use TestDFSIO to test the performance. In the AMD cluster, the bottleneck is the disk. The write throughput to HDFS is about 50 MB/s when the replication factor is 1 and each node runs one mapper, but the CPU consumption is about 50% for the DataNode and about 40% for the TestDFSIO mapper.

In the Atom cluster, the bottleneck is the CPU. I used the same settings and got similar write throughput, but the CPU consumption is close to 100% for both the DataNode and the mapper.

Could anyone tell me what CPU usage you see in your cluster?

Thanks,
Da
