[ http://issues.apache.org/jira/browse/HADOOP-424?page=all ]
Frédéric Bertin updated HADOOP-424:
-----------------------------------
Attachment: hadoop-424-2.patch
Sorry, the previous patch was badly generated; this one seems better.
> mapreduce jobs fail when no split is returned via inputFormat.getSplits
> -----------------------------------------------------------------------
>
> Key: HADOOP-424
> URL: http://issues.apache.org/jira/browse/HADOOP-424
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.5.0
> Reporter: Frédéric Bertin
> Attachments: emptyJobTest.patch, hadoop-424-2.patch, hadoop-424.patch
>
>
> I'm using a MapReduce job to process data that is logged and timestamped into
> files.
> When the job runs, it does not process all of the data; it keeps only the
> data that has been logged since the last job run.
> However, when no new data has been logged, the job fails because the
> getSplits method of InputFormat returns no splits, so the number of map
> tasks is 0. This case is not intercepted, and the job fails at the reduce
> step, apparently because it cannot find any data to process:
> java.io.FileNotFoundException: /local/home/hadoop/var/mapred/local/task_0030_r_000000_3/all.2
>     at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:121)
>     at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:47)
>     at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:221)
>     at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:150)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:259)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:253)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:241)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1013)
> What should Hadoop's behaviour be in such a case?
> IMHO, the job should be considered successful. This is not a job
> failure, just a lack of input data. WDYT?
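
To make the proposed behaviour concrete, here is a minimal, self-contained sketch of the idea: when no splits come back, report the job as successful instead of scheduling tasks. This is not the attached patch and it does not use Hadoop's classes. EmptySplitGuard, JobStatus, getSplits(List, long) and submit(List) are hypothetical names invented for illustration, and the timestamp filter only mimics the kind of InputFormat described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the job submission path; none of these names are Hadoop APIs.
class EmptySplitGuard {

    enum JobStatus { SUCCEEDED, SCHEDULED }

    // Assumption: mimics an input format that keeps only data logged after the last run.
    static List<String> getSplits(List<Long> logTimestamps, long lastRun) {
        List<String> splits = new ArrayList<>();
        for (long ts : logTimestamps) {
            if (ts > lastRun) {
                splits.add("split-" + ts);
            }
        }
        return splits;
    }

    // With zero splits there are zero map tasks, so there is nothing for a reduce
    // step to read. Report success right away rather than scheduling any tasks.
    static JobStatus submit(List<String> splits) {
        if (splits.isEmpty()) {
            return JobStatus.SUCCEEDED;
        }
        return JobStatus.SCHEDULED;
    }

    public static void main(String[] args) {
        List<Long> logged = Arrays.asList(100L, 200L);
        // Data logged since the last run at t=150: one split, so the job is scheduled.
        System.out.println(submit(getSplits(logged, 150L)));
        // Nothing logged since the last run at t=200: no splits, job counts as successful.
        System.out.println(submit(getSplits(logged, 200L)));
    }
}
```

Whether such a check belongs at submission time or in the reduce task itself is a design question this sketch does not settle; it only shows the "no input is not a failure" outcome.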