Task Tracker burns a lot of cpu in calling getLocalCache
---------------------------------------------------------
Key: HADOOP-4780
URL: https://issues.apache.org/jira/browse/HADOOP-4780
Project: Hadoop Core
Issue Type: Bug
Components: mapred
Reporter: Runping Qi

I have noticed that a task tracker often maxes out up to 6 CPUs. During that time, iostat shows that the majority of the usage is system CPU. The situation can last for quite a long time. While it lasted, I saw a number of threads in the following state:

   java.lang.Thread.State: RUNNABLE
        at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
        at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228)
        at java.io.File.exists(File.java:733)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:399)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
        at org.apache.hadoop.filecache.DistributedCache.getLocalCache(DistributedCache.java:176)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:140)

I suspect that getLocalCache is too expensive, and that calling it for every task initialization is wasteful.
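
The repeated getDU frames in the trace point to a recursive disk-usage walk that stats every file under the local cache directory each time a task starts. The following is only a rough sketch of that pattern and of one possible mitigation (remembering the computed size per directory). It is not Hadoop's actual code or the committed fix: the class DuSketch and the names walk, cachedSize, invalidate and walkCount are all hypothetical, made up for illustration.

```java
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not the real FileUtil or DistributedCache.
final class DuSketch {
    // Number of file-system nodes visited by walk(); each visit costs at
    // least one stat-like system call (exists/isDirectory/length).
    static final AtomicLong walkCount = new AtomicLong();

    // Remembered sizes, keyed by absolute directory path (an assumption of
    // this sketch; a real cache would also need to track later changes).
    private static final Map<String, Long> sizeCache = new ConcurrentHashMap<>();

    // Recursive disk-usage walk, similar in spirit to the getDU frames in
    // the stack trace: one stack frame per directory level.
    static long walk(File f) {
        walkCount.incrementAndGet();
        if (!f.exists()) {
            return 0;
        }
        if (!f.isDirectory()) {
            return f.length();
        }
        long size = f.length();
        File[] children = f.listFiles(); // null if the directory is unreadable
        if (children != null) {
            for (File child : children) {
                size += walk(child);
            }
        }
        return size;
    }

    // Walks the tree only on the first request for a directory; later
    // requests return the remembered value without touching the disk.
    static long cachedSize(File dir) {
        return sizeCache.computeIfAbsent(dir.getAbsolutePath(), p -> walk(dir));
    }

    // Drops the remembered size, e.g. after files are added or deleted.
    static void invalidate(File dir) {
        sizeCache.remove(dir.getAbsolutePath());
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println("size=" + cachedSize(dir) + " visited=" + walkCount.get());
        System.out.println("size=" + cachedSize(dir) + " visited=" + walkCount.get());
    }
}
```

With a scheme like this, the cost of the full walk would be paid once per cache directory rather than once per task initialization, at the price of having to invalidate or adjust the remembered size whenever the cache contents change.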