[ http://issues.apache.org/jira/browse/HADOOP-424?page=comments#action_12426736 ]

Doug Cutting commented on HADOOP-424:
-------------------------------------
I agree that this should not fail. Can you please modify one of the mini-mr unit tests to test for this case and submit that as a patch?

> mapreduce jobs fail when no split is returned via inputFormat.getSplits
> -----------------------------------------------------------------------
>
>          Key: HADOOP-424
>          URL: http://issues.apache.org/jira/browse/HADOOP-424
>      Project: Hadoop
>   Issue Type: Bug
>   Components: mapred
>     Affects Versions: 0.4.0
>     Reporter: Frédéric Bertin
>
> I'm using a MapReduce job to process some data logged and timestamped into
> files.
> When the job runs, it does not process the whole data, but filters only the
> data that has been logged since the last job run.
> However, when no new data has been logged, the job fails because the
> getSplits method of InputFormat returns no split. Thus the number of map
> tasks is 0. This is not intercepted, and the job fails at reduce step because
> it seems it does not find any data to process:
> java.io.FileNotFoundException: /local/home/hadoop/var/mapred/local/task_0030_r_000000_3/all.2
>     at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:121)
>     at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:47)
>     at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:221)
>     at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:150)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:259)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:253)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:241)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1013)
> What should be Hadoop's behaviour in such a case?
> IMHO, the job should be considered as successful. Indeed, this is not a job
> failure, but just a lack of input data. WDYT?

-- 
This message is automatically generated by JIRA.
- If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
- For more information on JIRA, see: http://www.atlassian.com/software/jira
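
As a rough illustration of the behaviour the reporter proposes (an empty set of splits counts as success rather than failure), here is a minimal, self-contained sketch. It does not use any Hadoop classes: `EmptySplitGuard`, `LogFile`, `Outcome`, `selectNewFiles` and `runJob` are all hypothetical names invented for this example, and the real fix in the mapred component may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustration only: none of these types are Hadoop classes.
 * They mimic a job whose input is filtered by timestamp, so that
 * a run with no new data produces zero splits.
 */
public class EmptySplitGuard {

    /** Hypothetical stand-in for one timestamped log file (one split). */
    static final class LogFile {
        final String path;
        final long loggedAtMillis;

        LogFile(String path, long loggedAtMillis) {
            this.path = path;
            this.loggedAtMillis = loggedAtMillis;
        }
    }

    /** Hypothetical job outcomes for this sketch. */
    enum Outcome { SUCCEEDED_NO_INPUT, SUBMITTED }

    /** Keeps only the files logged strictly after the previous run. */
    static List<LogFile> selectNewFiles(List<LogFile> all, long lastRunMillis) {
        List<LogFile> fresh = new ArrayList<>();
        for (LogFile f : all) {
            if (f.loggedAtMillis > lastRunMillis) {
                fresh.add(f);
            }
        }
        return fresh;
    }

    /**
     * Treats "no splits" as a successful no-op instead of letting the
     * reduce step start and fail on missing map output.
     */
    static Outcome runJob(List<LogFile> splits) {
        if (splits.isEmpty()) {
            return Outcome.SUCCEEDED_NO_INPUT;
        }
        // A real driver would schedule one map task per split here.
        return Outcome.SUBMITTED;
    }

    public static void main(String[] args) {
        List<LogFile> files = new ArrayList<>();
        files.add(new LogFile("log-0001", 1_000L));
        files.add(new LogFile("log-0002", 2_000L));

        // Last run happened after every file was logged: nothing new to process.
        System.out.println(runJob(selectNewFiles(files, 5_000L)));
        // Last run happened before the second file was logged: one split remains.
        System.out.println(runJob(selectNewFiles(files, 1_500L)));
    }
}
```

The guard sits before any task is scheduled, so the zero-split case never reaches the reduce step that throws the FileNotFoundException in the trace above.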
