CPU utilization keeps increasing when using HDFS

2014-09-01 Thread Shiyuan Xiao
Hi, we have written a MapReduce application based on Hadoop 2.4 which keeps reading data from HDFS (pseudo-distributed mode on one node). We found that the CPU system time and user time of the application keep increasing while it is running. If we changed the application to read data from local

Re: CPU utilization keeps increasing when using HDFS

2014-09-01 Thread Stanley Shi
Would you please give the output of the top command? At least to show that the HDFS process really did use that much CPU. On Mon, Sep 1, 2014 at 2:19 PM, Shiyuan Xiao shiyuan.x...@ericsson.com wrote: Hi We have written a MapReduce application based on Hadoop 2.4 which keeps reading data from

RE: CPU utilization keeps increasing when using HDFS

2014-09-01 Thread Shiyuan Xiao
...@pivotal.io] Sent: September 1, 2014 14:32 To: user@hadoop.apache.org Subject: Re: CPU utilization keeps increasing when using HDFS Would you please give the output of the top command? At least to show that the HDFS process really did use that much CPU. On Mon, Sep 1, 2014 at 2:19 PM, Shiyuan Xiao shiyuan.x

Re: CPU utilization keeps increasing when using HDFS

2014-09-01 Thread Gordon Wang
keeps increasing when using HDFS Would you please give the output of the top command? At least to show that the HDFS process really did use that much CPU. On Mon, Sep 1, 2014 at 2:19 PM, Shiyuan Xiao shiyuan.x...@ericsson.com wrote: Hi We have written a MapReduce application based

RE: CPU utilization keeps increasing when using HDFS

2014-09-01 Thread Shiyuan Xiao
@hadoop.apache.org Subject: Re: CPU utilization keeps increasing when using HDFS Because you are using a one-node pseudo-distributed cluster. When an HDFS client writes data to HDFS, the client computes a checksum for each data chunk and the datanode verifies it. This costs CPU. You can monitor the CPU usage
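
To illustrate why per-chunk checksums cost CPU on both sides, here is a minimal sketch of the idea in Python. It is not the HDFS implementation: the function names are hypothetical, plain CRC32 from zlib stands in for whatever checksum type the cluster is configured with, and the 512-byte chunk size is an assumption chosen for the example. The sender computes one checksum per chunk, and the receiver recomputes every checksum to verify the data, so on a one-node setup the same machine pays for both passes.

```python
import zlib
from typing import List

# Assumed chunk size for illustration only; not read from any Hadoop config.
BYTES_PER_CHECKSUM = 512


def chunk_checksums(data: bytes, chunk_size: int = BYTES_PER_CHECKSUM) -> List[int]:
    """Sender side (hypothetical helper): one CRC32 per fixed-size chunk."""
    return [
        zlib.crc32(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def verify_chunks(data: bytes, checksums: List[int],
                  chunk_size: int = BYTES_PER_CHECKSUM) -> bool:
    """Receiver side (hypothetical helper): recompute and compare every checksum."""
    return chunk_checksums(data, chunk_size) == checksums


# 2048 bytes of sample data, which splits into 4 chunks of 512 bytes.
payload = bytes(range(256)) * 8
sums = chunk_checksums(payload)

# Flip one byte to simulate corruption in transit.
corrupted = payload[:100] + b"\x00" + payload[101:]

print(len(sums))
print(verify_chunks(payload, sums))
print(verify_chunks(corrupted, sums))
```

Every byte is hashed once by the writer and once by the verifier, so the checksum work grows linearly with the amount of data moved.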