IsolationRunner does not work as documented
-------------------------------------------
Key: HADOOP-4041
URL: https://issues.apache.org/jira/browse/HADOOP-4041
Project: Hadoop Core
Issue Type: Bug
Components: documentation, mapred
Affects Versions: 0.18.0
Reporter: Yuri Pradkin
IsolationRunner does not work as documented in the tutorial.
The tutorial says "To use the IsolationRunner, first set
keep.failed.tasks.files to true (also see keep.tasks.files.pattern)."
The property name is misspelled there. It should be:
keep.failed.task.files ("task", not "tasks")
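For reference, the corrected property name would look like this in a Hadoop configuration file. This is only a sketch: putting it in hadoop-site.xml or in the job's own configuration is an assumption, since the tutorial text quoted above does not say where to set it.

```xml
<!-- Sketch only: which file this goes in is an assumption. -->
<property>
  <name>keep.failed.task.files</name>
  <value>true</value>
</property>
```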
After setting that property, I ran into the following (quoted from my message on hadoop-core):
> After the task hung, I failed it via the web interface. Then I went to the
> node that was running this task:
>
> $ cd ...local/taskTracker/jobcache/job_200808071645_0001/work
> (this path is already different from the tutorial's)
>
> $ hadoop org.apache.hadoop.mapred.IsolationRunner ../job.xml
> Exception in thread "main" java.lang.NullPointerException
>     at org.apache.hadoop.mapred.IsolationRunner.main(IsolationRunner.java:164)
>
> Looking at IsolationRunner code, I see this:
>
> 164     File workDirName = new File(lDirAlloc.getLocalPathToRead(
> 165         TaskTracker.getJobCacheSubdir()
> 166         + Path.SEPARATOR + taskId.getJobID()
> 167         + Path.SEPARATOR + taskId
> 168         + Path.SEPARATOR + "work",
> 169         conf).toString());
>
> That is, it assumes there is a task ID subdirectory under the job
> directory, but:
> $ pwd
> ...mapred/local/taskTracker/jobcache/job_200808071645_0001
> $ ls
> jars job.xml work
>
> -- it's not there.
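The layout mismatch described above can be checked on a node with a small standalone sketch. It uses only the JDK, not Hadoop classes. The class name WorkDirLayoutCheck, its method names, and the task ID passed on the command line are hypothetical. It only reports which of the two "work" directory layouts exists under a job directory. It does not establish that the missing directory is the actual cause of the NullPointerException.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hadoop: compares the directory layout
// implied by the IsolationRunner excerpt with the one observed on the node.
public class WorkDirLayoutCheck {

    // Layout implied by the excerpt: <jobDir>/<taskId>/work
    static Path perTaskWorkDir(Path jobDir, String taskId) {
        return jobDir.resolve(taskId).resolve("work");
    }

    // Layout observed in the report: <jobDir>/work
    static Path perJobWorkDir(Path jobDir) {
        return jobDir.resolve("work");
    }

    // Returns a short description of which layout is present on disk.
    static String describe(Path jobDir, String taskId) {
        if (Files.isDirectory(perTaskWorkDir(jobDir, taskId))) {
            return "per-task work dir present";
        }
        if (Files.isDirectory(perJobWorkDir(jobDir))) {
            return "only per-job work dir present";
        }
        return "no work dir found";
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: WorkDirLayoutCheck <jobDir> <taskId>");
            System.exit(2);
        }
        System.out.println(describe(Paths.get(args[0]), args[1]));
    }
}
```

Run it with the job directory (for example, the job_200808071645_0001 directory from the report) and the ID of the failed task as the two arguments.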