DiskChecker$DiskErrorException
------------------------------
Key: HADOOP-4148
URL: https://issues.apache.org/jira/browse/HADOOP-4148
Project: Hadoop Core
Issue Type: Bug
Affects Versions: 0.17.2
Environment: 2 systems
1- redhat
1- ubuntu
Reporter: chandravadana
Priority: Blocker
Fix For: 0.17.2
Hi,
The cluster has two nodes:
1- redhat - master (jobtracker + namenode + tasktracker + datanode)
1- ubuntu - slave (tasktracker + datanode)
When I execute
bin/hadoop jar word/word.jar org.myorg.WordCount in mn2
08/09/10 15:12:56 INFO mapred.FileInputFormat: Total input paths to process : 5
08/09/10 15:12:56 INFO mapred.JobClient: Running job: job_200809101511_0003
08/09/10 15:12:57 INFO mapred.JobClient: map 0% reduce 0%
08/09/10 15:13:00 INFO mapred.JobClient: map 20% reduce 0%
08/09/10 15:13:01 INFO mapred.JobClient: map 80% reduce 0%
08/09/10 15:13:02 INFO mapred.JobClient: map 100% reduce 0%
08/09/10 15:13:11 INFO mapred.JobClient: map 100% reduce 13%
08/09/10 15:30:41 INFO mapred.JobClient: map 80% reduce 13%
08/09/10 15:30:41 INFO mapred.JobClient: Task Id : task_200809101511_0003_m_000000_0, Status : FAILED
Too many fetch-failures
08/09/10 15:30:42 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=task_200809101511_0003_m_000000_0&filter=stdout
08/09/10 15:30:42 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=task_200809101511_0003_m_000000_0&filter=stderr
08/09/10 15:30:44 INFO mapred.JobClient: map 100% reduce 13%
08/09/10 15:30:49 INFO mapred.JobClient: map 100% reduce 20%
08/09/10 15:40:52 INFO mapred.JobClient: Task Id : task_200809101511_0003_m_000004_0, Status : FAILED
Too many fetch-failures
08/09/10 15:40:52 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=task_200809101511_0003_m_000004_0&filter=stdout
08/09/10 15:40:52 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&taskid=task_200809101511_0003_m_000004_0&filter=stderr
08/09/10 15:41:03 INFO mapred.JobClient: map 100% reduce 26%
At this point the job halts.
When I looked at the tasktracker's log, I found:
getMapOutput(task_200809101511_0003_m_000004_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200809101511_0003/task_200809101511_0003_m_000004_0/output/file.out.index in any of the configured local directories
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
    at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2315)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
    at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
    at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
    at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
    at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
    at org.mortbay.http.HttpServer.service(HttpServer.java:954)
    at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
    at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
    at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
    at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
    at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
    at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
2008-09-10 15:33:12,915 WARN org.apache.hadoop.mapred.TaskTracker: Unknown child with bad map output: task_200809101511_0003_m_000004_0. Ignored.
2008-09-10 15:33:17,425 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:33:23,431 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:33:29,437 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:33:32,439 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:33:38,445 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:33:44,451 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:33:47,454 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:33:53,460 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:33:59,465 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:34:02,469 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:34:08,475 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:34:14,480 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:34:17,484 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:34:23,490 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:34:29,495 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
2008-09-10 15:34:32,498 INFO org.apache.hadoop.mapred.TaskTracker: task_200809101511_0003_r_000000_0 0.20000002% reduce > copy (3 of 5 at 0.00 MB/s) >
The reduce task runs on the master (redhat).
The map task named in the log, task_200809101511_0003_m_000004_0, ran on the slave (ubuntu).
In the jobtracker's log, I found:
2008-09-10 15:35:46,977 INFO org.apache.hadoop.mapred.JobInProgress: Failed fetch notification #2 for task task_200809101511_0003_m_000004_0
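The DiskErrorException above says the index file was not found "in any of the configured local directories". To check that by hand on each node, something like the following rough sketch could be used. It only approximates the lookup done by LocalDirAllocator.getLocalPathToRead; the function name find_in_local_dirs and the directory used in the usage lines are hypothetical, and the real directories would have to come from mapred.local.dir on the node being checked.

```python
import os
import tempfile


def find_in_local_dirs(local_dirs, relative_path):
    """Return the first existing file <dir>/<relative_path>, or None.

    local_dirs is assumed to hold the entries of mapred.local.dir
    for the node under inspection (hypothetical input).
    """
    for base in local_dirs:
        candidate = os.path.join(base, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


# Relative path copied from the DiskErrorException message.
MAP_OUTPUT_INDEX = (
    "taskTracker/jobcache/job_200809101511_0003/"
    "task_200809101511_0003_m_000004_0/output/file.out.index"
)

if __name__ == "__main__":
    # Placeholder directory; replace with the node's real local dirs.
    local_dirs = [tempfile.mkdtemp()]
    found = find_in_local_dirs(local_dirs, MAP_OUTPUT_INDEX)
    print(found if found else "not found in any configured local directory")
```

Running it on both the master and the slave would show which node, if any, still holds the map output that the reducer is trying to fetch.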
hadoop-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310/</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <final>true</final>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>absolute path</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512M</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.speculative.execution</name>
    <value>false</value>
    <final>true</final>
  </property>
</configuration>
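Since this configuration does not set mapred.local.dir, the tasktracker should fall back to the stock default of ${hadoop.tmp.dir}/mapred/local. A small sketch of working out the effective local directories from a config file like the one above follows. The helper names and the sample value /hypothetical/tmp are illustrative assumptions only, and the substitution handles just the ${hadoop.tmp.dir} variable rather than Hadoop's full variable expansion.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Parse a hadoop-site.xml style document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def effective_local_dirs(props):
    """Expand mapred.local.dir, assuming the ${hadoop.tmp.dir}/mapred/local default.

    Simplification: if hadoop.tmp.dir is unset, an empty string is
    substituted instead of Hadoop's own default for that property.
    """
    tmp_dir = props.get("hadoop.tmp.dir", "")
    raw = props.get("mapred.local.dir", "${hadoop.tmp.dir}/mapred/local")
    return [
        entry.strip().replace("${hadoop.tmp.dir}", tmp_dir)
        for entry in raw.split(",")
        if entry.strip()
    ]


# Hypothetical sample; the real file uses an unspecified absolute path.
SAMPLE = """<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hypothetical/tmp</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(effective_local_dirs(read_properties(SAMPLE)))
```

The resulting list is what would be searched for taskTracker/jobcache/... on each node.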
I don't know where I went wrong.
Kindly help me solve this.
Thanks,
Chandravadana