Stalled M/R task

Ion Badita Wed, 14 Mar 2007 21:43:02 -0800

Hi,

My configuration: Hadoop 0.12.0, 19 computers, JDK 6.

I ran an M/R task with 133 maps and one reduce. One of the nodes was reported as "Lost task tracker" and the task stalled at 99% on map and 32% on reduce. It stayed like this for hours with no activity. The tasks from the lost task tracker were not moved to another TT.

In the console where I started the task I saw this stack trace:


07/03/14 21:33:30 INFO mapred.JobClient:  map 98% reduce 22%
07/03/14 21:34:20 INFO mapred.JobClient:  map 98% reduce 23%
07/03/14 21:34:30 INFO mapred.JobClient:  map 98% reduce 24%

07/03/14 21:35:13 INFO mapred.JobClient: Task Id : task_0007_m_000034_0, Status : FAILED
07/03/14 21:35:13 INFO mapred.JobClient: Communication problem with server: java.net.MalformedURLException: no protocol: null&filter=stdout
       at java.net.URL.<init>(URL.java:567)
       at java.net.URL.<init>(URL.java:464)
       at java.net.URL.<init>(URL.java:413)
       at org.apache.hadoop.mapred.JobClient.displayTaskLogs(JobClient.java:621)
       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:583)
       at com.genzen.crawler.utils.ToolBase.runJob(ToolBase.java:103)
       at com.genzen.crawler.indexer.Indexer.doMain(Indexer.java:87)
       at com.genzen.crawler.utils.ToolBase.doMain0(ToolBase.java:96)
       at com.genzen.crawler.utils.ToolBase.execute(ToolBase.java:117)
       at com.genzen.crawler.indexer.Indexer.main(Indexer.java:92)

07/03/14 21:35:21 INFO mapred.JobClient:  map 97% reduce 24%
07/03/14 21:35:23 INFO mapred.JobClient:  map 98% reduce 24%
07/03/14 21:35:47 INFO mapred.JobClient:  map 99% reduce 24%
07/03/14 21:37:50 INFO mapred.JobClient:  map 99% reduce 25%
07/03/14 21:38:20 INFO mapred.JobClient:  map 99% reduce 26%
07/03/14 21:38:40 INFO mapred.JobClient:  map 99% reduce 27%
07/03/14 21:39:11 INFO mapred.JobClient:  map 99% reduce 28%
07/03/14 21:39:41 INFO mapred.JobClient:  map 99% reduce 29%
07/03/14 21:40:11 INFO mapred.JobClient:  map 99% reduce 30%
07/03/14 21:40:30 INFO mapred.JobClient:  map 99% reduce 31%
07/03/14 21:41:11 INFO mapred.JobClient:  map 99% reduce 32%
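
The message "no protocol: null&filter=stdout" looks like what java.net.URL reports when a null base string is concatenated with a query suffix before the URL is built. The sketch below reproduces only that message with plain JDK classes. It is an assumption about the cause, not the actual JobClient code; the class NullUrlDemo, the method describe, and the sample host URL are hypothetical names made up for the illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class NullUrlDemo {
    // Hypothetical helper: appends the log filter to a base URL string and
    // tries to parse the result, returning "ok" or the exception message.
    @SuppressWarnings("deprecation")
    static String describe(String baseUrl) {
        // String concatenation turns a null reference into the text "null".
        String spec = baseUrl + "&filter=stdout";
        try {
            new URL(spec);
            return "ok";
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A null base (assumed here to stand in for a missing task tracker URL).
        System.out.println(describe(null));
        // A made-up, well-formed base URL parses without error.
        System.out.println(describe("http://host:50060/tasklog?taskid=x"));
    }
}
```

If this guess is right, the exception is only a side effect of the client failing to fetch the logs of the failed task, and is probably separate from the stall itself.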

I have had task trackers crash in the past with the same configuration. Some of the tasks got rescheduled on different machines, others did not, and because of that the whole M/R task never recovered.


This is a "simulation" of a real environment, where computers crash.

Any help will be appreciated.


John
