Hi all,

We are using CDH3B4 on our Hadoop cluster.
We have hourly jobs that run through the streaming API. Each of these jobs used to take 4 to 5 minutes to complete, but since 1pm yesterday they have suddenly started taking 3 to 4 hours. We looked at the data the jobs are working on, and it is exactly the same as it always has been. The cluster and its configuration have not been touched since the upgrade to CDH3B4, which was one month ago.

No errors are being reported in any of the logs; the jobs are just taking longer, much longer. One thing I have noticed: while a job sits stalled partway through, I consistently see entries like these in the slave log files:

2011-04-28 11:16:07,849 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
2011-04-28 11:16:07,849 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=10/0/0 in:NA [rec/s] out:NA [rec/s]

I see these entries in both the map and reduce phases, while the jobs sit idle for many tens of minutes without doing anything. This happens even when nothing else is running on the cluster.

If anyone can shed some light on this or point me in a direction to investigate further, it would be much appreciated.

Thank you.

Regards,
Abhinay Mehta
