Hi Abhinay,

If you have access to the compute nodes, then

1) jstack of streaming mapper jvm
2) strace -f of streaming mapper jvm
3) strace -f of streaming map process itself

might help.

Koji

On 4/28/11 3:33 AM, "Abhinay Mehta" <[email protected]> wrote:

> Hi all,
>
> We are using CDH3B4 on the Hadoop Cluster.
>
> We have hourly jobs kicking off every hour using the streaming API,
> each one of these jobs used to take 4/5 mins to complete but since 1pm
> yesterday all of a sudden started taking 3/4 hours.
>
> We looked at the data the jobs are working on and the data is exactly the
> same as it always has been.
> The cluster / config has not been touched since the upgrade to CDH3B4 which
> was one month ago.
>
> No errors are being reported in any of the logs, the jobs are just taking
> longer, much longer.
> One thing I have noticed in the logs, when the jobs just sit there in the
> middle of a job I do see one consistent entry in the slave log files:
>
> 2011-04-28 11:16:07,849 INFO org.apache.hadoop.streaming.PipeMapRed:
> R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
> 2011-04-28 11:16:07,849 INFO org.apache.hadoop.streaming.PipeMapRed:
> R/W/S=10/0/0 in:NA [rec/s] out:NA [rec/s]
>
> I see that entry in Map phases and Reduce phases, when the jobs just sit
> idle for many tens of mins not doing anything.
> This happens even if there is nothing else running on the cluster.
>
> If anyone can shed some light on this or give me a direction to look into
> further then it would be much appreciated.
>
> Thank you.
>
> Regards,
> Abhinay Mehta
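
P.S. Before attaching jstack or strace, it may be worth scanning the task logs to see which tasks show the pattern in the quoted PipeMapRed lines. Below is a rough Python sketch that does this. It assumes the three R/W/S counters are records read / written / skipped, and that each status entry sits on a single line in the real log file (the mail client wrapped them above). The function names are made up for illustration and are not part of Hadoop.

```python
import re

# Matches the status line logged by PipeMapRed, for example:
# "... PipeMapRed: R/W/S=10/0/0 in:NA [rec/s] out:NA [rec/s]"
# Assumption: the counters are records read / written / skipped.
STATUS_RE = re.compile(r"PipeMapRed:\s*R/W/S=(\d+)/(\d+)/(\d+)")


def parse_status(line):
    """Return (read, written, skipped) as ints, or None if no status is found."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def looks_stalled(lines):
    """True if the task read at least one record but never wrote any."""
    statuses = [s for s in map(parse_status, lines) if s is not None]
    if not statuses:
        return False
    max_read = max(s[0] for s in statuses)
    max_written = max(s[1] for s in statuses)
    return max_read > 0 and max_written == 0
```

Running looks_stalled over the lines of each task attempt's log would narrow down which attempts (and so which nodes and pids) are worth a jstack or strace. It only flags the symptom; it says nothing about why the streaming process is not producing output.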
