[ https://issues.apache.org/jira/browse/HADOOP-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467889 ]
Owen O'Malley commented on HADOOP-933:
--------------------------------------
For a fix that doesn't depend on HADOOP-867, I would propose:
1. Replace the InputSplit field in MapTask with:
     private byte[] inputSplit;
     private String inputSplitClassname;
2. The MapTask constructor still takes an InputSplit and serializes it to set
inputSplit and inputSplitClassname.
3. The MapTask.run method uses the bytes and classname to reconstruct the
InputSplit as a local variable in run.
4. I don't think the change to set map.input.* properties to default values
in the non-FileSplit case is reasonable.
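
A rough sketch of steps 1-3, to make the proposal concrete. This is not the
attached patch and not the Hadoop API: SplitSketch, RangeSplit, MapTaskSketch
and reconstructSplit are hypothetical names, and SplitSketch only imitates a
write/readFields serialization contract. The class loader passed to
reconstructSplit is assumed to be one that can see the job jar, which is the
case in the child process but not in the TaskTracker.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a serializable split; not the Hadoop interface.
interface SplitSketch {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical application-defined split that only the job jar would contain.
class RangeSplit implements SplitSketch {
    long start;
    long end;

    // A no-arg constructor is needed so the split can be created reflectively.
    public RangeSplit() {}

    RangeSplit(long start, long end) {
        this.start = start;
        this.end = end;
    }

    public void write(DataOutput out) throws IOException {
        out.writeLong(start);
        out.writeLong(end);
    }

    public void readFields(DataInput in) throws IOException {
        start = in.readLong();
        end = in.readLong();
    }
}

class MapTaskSketch {
    // Step 1: keep opaque bytes plus a class name instead of the split object.
    private final byte[] inputSplit;
    private final String inputSplitClassname;

    // Step 2: the constructor still takes a split, but serializes it right away.
    MapTaskSketch(SplitSketch split) {
        inputSplitClassname = split.getClass().getName();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            split.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        inputSplit = buffer.toByteArray();
    }

    // Step 3: called from run(); the split class is resolved only here,
    // using a loader that is assumed to include the job jar.
    SplitSketch reconstructSplit(ClassLoader jobClassLoader) {
        try {
            Class<?> cls = Class.forName(inputSplitClassname, true, jobClassLoader);
            SplitSketch split = (SplitSketch) cls.getDeclaredConstructor().newInstance();
            split.readFields(new DataInputStream(new ByteArrayInputStream(inputSplit)));
            return split;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "cannot load split class " + inputSplitClassname, e);
        }
    }
}
```

With this shape, whatever process merely forwards the task only ever handles
a byte array and a String, so it never needs the application's split class on
its classpath.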
> Application defined InputSplits do not work
> -------------------------------------------
>
> Key: HADOOP-933
> URL: https://issues.apache.org/jira/browse/HADOOP-933
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.10.1
> Reporter: Benjamin Reed
> Fix For: 0.10.1
>
> Attachments: MapTask.patch
>
>
> If an application defines its own InputSplit, the task tracker chokes when it
> cannot deserialize the InputSplit when it deserializes MapTasks it receives
> from the JobTracker. This is because the TaskTracker does not resolve classes
> from the job jar file. The attached patch delays resolution of the InputSplit
> until it is running in the context of the child process where it can resolve
> the InputSplit class.