[
https://issues.apache.org/jira/browse/MAPREDUCE-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250600#comment-13250600
]
Harsh J commented on MAPREDUCE-4130:
------------------------------------
Also related are MAPREDUCE-2384 (its test, at least, can be reused) and perhaps
MAPREDUCE-1588, if it still applies to the limits available in YARN's mrapp.
The splits are currently computed and written in one call, so this has to
happen after a staging directory is available (with a job ID on its path).
I've not seen many failures in practice at the input split computation
level, but the idea of separating these two calls seems alright to me.
> Job ID creation is not required if the job fails because the input path is
> unavailable. Can input and output path validation be done before the job ID
> creation step?
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4130
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4130
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Reporter: Ramgopal N
> Priority: Minor
>
> If the input splits cannot be computed because the input paths don't exist,
> the job is not submitted and an error is thrown. But before that, a job ID is
> created, which can be avoided.
> The sequence is:
> 1) Output path validation
> 2) Job ID creation
> 3) Input splits computation
> But if the sequence of steps is 1, 3, 2, unnecessary job ID creation can be
> avoided.
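
The proposed 1, 3, 2 ordering can be sketched roughly as below. This is only an
illustration, not the actual Hadoop job submission code: the class
SubmitOrderSketch and the helpers checkOutput, computeSplits and newJobId are
hypothetical, the file system is modelled as a plain set of existing paths, and
the staging directory is modelled as an in-memory map. It assumes split
computation can be separated from writing the splits, as discussed in the
comment above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SubmitOrderSketch {
    // Hypothetical counter standing in for the cluster call that hands out job IDs.
    static int jobIdsAllocated = 0;

    // Hypothetical stand-in for split files written under the staging directory.
    static final Map<String, List<String>> stagedSplits = new HashMap<>();

    static String newJobId() {
        jobIdsAllocated++;
        return "job_sketch_" + jobIdsAllocated;
    }

    // Step 1: output path validation (fail if the output path already exists).
    static void checkOutput(Set<String> existingPaths, String output) {
        if (existingPaths.contains(output)) {
            throw new IllegalStateException("Output path already exists: " + output);
        }
    }

    // Step 3: compute splits in memory only; nothing is written yet,
    // so no staging directory (and no job ID) is needed at this point.
    static List<String> computeSplits(Set<String> existingPaths, List<String> inputs) {
        List<String> splits = new ArrayList<>();
        for (String input : inputs) {
            if (!existingPaths.contains(input)) {
                throw new IllegalArgumentException("Input path does not exist: " + input);
            }
            splits.add(input + "#split-0");
        }
        return splits;
    }

    // Order 1, 3, 2: the job ID is requested only after both checks pass.
    static String submit(Set<String> existingPaths, List<String> inputs, String output) {
        checkOutput(existingPaths, output);                          // step 1
        List<String> splits = computeSplits(existingPaths, inputs);  // step 3
        String jobId = newJobId();                                   // step 2
        stagedSplits.put(jobId, splits);  // the write happens once a job ID exists
        return jobId;
    }

    public static void main(String[] args) {
        Set<String> fs = Set.of("/data/in");
        try {
            submit(fs, List.of("/data/missing"), "/data/out");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()
                    + " (job IDs allocated so far: " + jobIdsAllocated + ")");
        }
        System.out.println("Submitted " + submit(fs, List.of("/data/in"), "/data/out"));
    }
}
```

With this ordering, a missing input path fails before newJobId() is ever
called; the cost is that the splits are held in memory until the job ID, and
hence the staging location, is known.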