[
https://issues.apache.org/jira/browse/HADOOP-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12469625
]
dhruba borthakur commented on HADOOP-885:
-----------------------------------------
The default setting for ipc.client.connection.maxidletime is 1 second, while
heartbeats are sent every 3 seconds. Each connection therefore goes idle and is
closed between heartbeats, which must be the reason why we see a large number
of accept() and close() calls. I propose that we change the default value of
ipc.client.connection.maxidletime from 1 to 4 seconds.
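
To make the reasoning concrete, here is a minimal sketch of the interaction
between the two settings. It is not the Hadoop IPC client code: the class and
method names are hypothetical, and it assumes a connection is closed whenever
the gap between heartbeats exceeds the idle limit.

```java
// Hypothetical model only; this is not Hadoop's IPC client implementation.
class IdleTimeoutSketch {
    /**
     * Counts how many connections one client opens while sending the given
     * number of heartbeats, one every heartbeatMs milliseconds, assuming the
     * connection is closed once it has been idle longer than maxIdleMs.
     */
    static int connectionsOpened(long heartbeatMs, long maxIdleMs, int heartbeats) {
        int opened = 0;
        boolean connected = false;
        for (int i = 0; i < heartbeats; i++) {
            if (!connected) {
                opened++;           // the server would see an accept() here
                connected = true;
            }
            // The heartbeat is sent, then the connection sits idle for
            // heartbeatMs until the next heartbeat is due.
            if (heartbeatMs > maxIdleMs) {
                connected = false;  // idle limit exceeded: close()
            }
        }
        return opened;
    }

    public static void main(String[] args) {
        // Current defaults: 3 s heartbeats, 1 s idle limit.
        System.out.println(connectionsOpened(3000, 1000, 10));
        // Proposed default: 4 s idle limit.
        System.out.println(connectionsOpened(3000, 4000, 10));
    }
}
```

Under these assumptions, a 1 second limit opens a new connection for every
heartbeat, while a 4 second limit lets one connection be reused.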
> Reduce CPU usage on namenode: gettimeofday
> ------------------------------------------
>
> Key: HADOOP-885
> URL: https://issues.apache.org/jira/browse/HADOOP-885
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.10.1
> Reporter: dhruba borthakur
> Assigned To: dhruba borthakur
> Attachments: WallClock.java
>
>
> On a 900-node idle cluster, the namenode uses about 20% of CPU. Most of
> this CPU is spent processing pure heartbeats. No jobs are running on this
> cluster and all nodes are alive and behaving well.
> Of the total namenode CPU usage, about 12% is in user mode and about 70% is in
> kernel mode! The question that naturally arises is: why is heartbeat
> processing taking so much time in kernel mode?
> An strace of the namenode reveals that a 20-second period has about 52000
> syscalls with the following breakdown:
> gettimeofday : 18000 calls
> accept : 2655 calls
> close : 2655 calls
> shutdown : 2655 calls
> fcntl : 7965 calls
> read : 7965 calls
> futex : 5295 calls
> poll : 4894 calls
> Code inspection reveals that the RPC.java and Server.java classes make
> multiple (about 5) calls to System.currentTimeMillis() while processing a
> single request. This suggests that there is room for optimization.
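
One possible shape for such an optimization is to read the clock once per
request and pass the timestamp to every step that needs it. The sketch below
is only an illustration: the class, the counting clock, and the five "steps"
are hypothetical and do not reflect the actual structure of RPC.java or
Server.java.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongSupplier;

// Hypothetical illustration; not taken from RPC.java or Server.java.
class RequestClockSketch {
    /** Wraps System.currentTimeMillis() and counts how often it is read. */
    static final class CountingClock implements LongSupplier {
        final AtomicInteger reads = new AtomicInteger();

        @Override
        public long getAsLong() {
            reads.incrementAndGet();
            return System.currentTimeMillis();
        }
    }

    static long lastSeen; // stands in for whatever a step does with the time

    static void step(long now) {
        lastSeen = now;
    }

    /** Every step reads the clock itself: one clock read per step. */
    static int clockReadsPerStep(int steps) {
        CountingClock clock = new CountingClock();
        for (int i = 0; i < steps; i++) {
            step(clock.getAsLong());
        }
        return clock.reads.get();
    }

    /** The clock is read once and the value is shared by all steps. */
    static int clockReadsShared(int steps) {
        CountingClock clock = new CountingClock();
        long now = clock.getAsLong();
        for (int i = 0; i < steps; i++) {
            step(now);
        }
        return clock.reads.get();
    }

    public static void main(String[] args) {
        System.out.println("per-step reads: " + clockReadsPerStep(5));
        System.out.println("shared reads:   " + clockReadsShared(5));
    }
}
```

Sharing one timestamp trades a few milliseconds of precision within a request
for fewer clock reads, which is only acceptable if no step depends on the
elapsed time between steps.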