Maybe you should upgrade to the current trunk, or wait for 0.12.1 to be released (it won't be long). The trunk contains major fixes, and I suspect you may be hitting the bug tracked as HADOOP-1060.
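
As a side note on the exception in the quoted console output: the message "no protocol: null&filter=stdout" has the shape you would get if "&filter=stdout" were appended to a base URL string that was null. Below is a minimal Java sketch of that effect. The class NullLogUrlDemo and the helper buildLogUrl are hypothetical names made up for illustration; this is not Hadoop's JobClient code, and I have not confirmed that this is the cause in your run or that it is related to the bug mentioned above.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class NullLogUrlDemo {

    // Hypothetical helper (not Hadoop code): appends a filter parameter
    // to a base URL string that may or may not have been set.
    static String buildLogUrl(String baseUrl, String filter) {
        // Java string concatenation renders a null reference as the text "null".
        return baseUrl + "&filter=" + filter;
    }

    // Returns the parser's error message if spec is not a valid URL,
    // or null if it parses cleanly.
    static String parseError(String spec) {
        try {
            new URL(spec);
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Simulate a base URL that was never filled in.
        String spec = buildLogUrl(null, "stdout");
        System.out.println("URL string: " + spec);
        System.out.println("Parse error: " + parseError(spec));
    }
}
```

If that is what happened, a null check on the base URL before building the log URL would avoid the exception in the client, but it would not by itself explain or fix the stalled job.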
> -----Original Message-----
> From: Ion Badita [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, March 15, 2007 11:12 AM
> To: [email protected]
> Subject: Stalled M/R task
>
> Hi,
>
> My configuration: hadoop 0.12.0, 19 computers, jdk 6.
>
> I ran an m/r task with 133 maps and one reduce. One of the nodes was
> reported as "Lost task tracker" and the task stalled at 99% on map and 32%
> on reduce. It stayed like this for hours with no activity. The tasks
> from the lost task tracker were not moved to another TT.
> In the console from which I started the task I saw this stack trace:
>
>
> 07/03/14 21:33:30 INFO mapred.JobClient: map 98% reduce 22%
> 07/03/14 21:34:20 INFO mapred.JobClient: map 98% reduce 23%
> 07/03/14 21:34:30 INFO mapred.JobClient: map 98% reduce 24%
> 07/03/14 21:35:13 INFO mapred.JobClient: Task Id : task_0007_m_000034_0, Status : FAILED
> 07/03/14 21:35:13 INFO mapred.JobClient: Communication problem with server: java.net.MalformedURLException: no protocol: null&filter=stdout
>     at java.net.URL.<init>(URL.java:567)
>     at java.net.URL.<init>(URL.java:464)
>     at java.net.URL.<init>(URL.java:413)
>     at org.apache.hadoop.mapred.JobClient.displayTaskLogs(JobClient.java:621)
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:583)
>     at com.genzen.crawler.utils.ToolBase.runJob(ToolBase.java:103)
>     at com.genzen.crawler.indexer.Indexer.doMain(Indexer.java:87)
>     at com.genzen.crawler.utils.ToolBase.doMain0(ToolBase.java:96)
>     at com.genzen.crawler.utils.ToolBase.execute(ToolBase.java:117)
>     at com.genzen.crawler.indexer.Indexer.main(Indexer.java:92)
>
> 07/03/14 21:35:21 INFO mapred.JobClient: map 97% reduce 24%
> 07/03/14 21:35:23 INFO mapred.JobClient: map 98% reduce 24%
> 07/03/14 21:35:47 INFO mapred.JobClient: map 99% reduce 24%
> 07/03/14 21:37:50 INFO mapred.JobClient: map 99% reduce 25%
> 07/03/14 21:38:20 INFO mapred.JobClient: map 99% reduce 26%
> 07/03/14 21:38:40 INFO mapred.JobClient: map 99% reduce 27%
> 07/03/14 21:39:11 INFO mapred.JobClient: map 99% reduce 28%
> 07/03/14 21:39:41 INFO mapred.JobClient: map 99% reduce 29%
> 07/03/14 21:40:11 INFO mapred.JobClient: map 99% reduce 30%
> 07/03/14 21:40:30 INFO mapred.JobClient: map 99% reduce 31%
> 07/03/14 21:41:11 INFO mapred.JobClient: map 99% reduce 32%
>
>
> I have had task trackers crash in the past with the same configuration. Some
> of their tasks got rescheduled on different machines, others did not, and because
> of that the whole m/r job never recovered.
>
> This is a "simulation" of a real environment, where computers crash.
>
> Any help will be appreciated.
>
>
> John
