Hi, I solved it by creating a new JobConf instance for each iteration in the loop.
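In case it helps anyone else, the loop now looks roughly like this (a minimal sketch of the change only; MyClass, MyMap and MyReduce are my own classes from the snippet quoted below, the loop sits in the driver's main method, and the rest of the job setup is omitted):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

int round = 0;
FileStatus[] directory;
FileSystem fs = FileSystem.get(new Configuration()); // default FS from core-site.xml (HDFS here)

do {
    // Build a fresh JobConf every iteration; reusing one JobConf across
    // JobClient.runJob() calls seems to leave stale job/staging paths in the
    // conf, which apparently triggered the "Wrong FS" error on the second round.
    JobConf jobconf = new JobConf(new Configuration(), MyClass.class);

    String old_path = "path_" + Integer.toString(round);
    round = round + 1;
    String new_path = "path_" + Integer.toString(round);

    FileInputFormat.addInputPath(jobconf, new Path(old_path));
    FileOutputFormat.setOutputPath(jobconf, new Path(new_path));

    jobconf.setMapperClass(MyMap.class);
    jobconf.setReducerClass(MyReduce.class);
    // ... other job setup ...

    JobClient.runJob(jobconf);

    // Check whether the latest output directory contains any files
    directory = fs.listStatus(new Path(new_path));
} while (directory.length != 0); // stop once an iteration produces no output files

A side benefit: addInputPath() accumulates input paths on the JobConf, so with a fresh conf each round the job reads only the previous round's output instead of every earlier directory as well.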
Thanks & regards Arko On Oct 12, 2011, at 1:54 AM, Arko Provo Mukherjee <arkoprovomukher...@gmail.com> wrote: > Hello Everyone, > > I have a particular situation, where I am trying to run Iterative Map-Reduce, > where the output files for one iteration are the input files for the next. > It stops when there are no new files created in the output. > > Code Snippet: > > int round = 0; > JobConf jobconf = new JobConf(new Configuration(), MyClass.class); > > do { > > String old_path = "path_" + Integer.toString(round); > > round = round + 1; > > String new_path = "path" + Integer.toString(round); > > FileInputFormat.addInputPath ( jobconf, new Path (old_file) ); > > FileInputFormat.setInputPath ( jobconf, new Path (new_file) ); // These > will eventually become directories containing multiple files > > jobconf.setMapperClass(MyMap.class); > > jobconf.setReducerClass(MyReduce.class); > > // Other code > > JobClient.runJob(jobconf); > > FileStatus[] directory = fs.listStatus ( new Path ( new_file ) ); // To > check for any new files in the output directory > > } while ( directory.length != 0 ); // Stop iteration only when no new files > are generated in the output path > > > > The code runs smoothly in the first round and I can see the new directory > path_1 getting created and files added in it from the Reducer output. > > The original path_0 is created from before by me and I have added relevant > files in it. > > The output files seems to have the correct data as per my Map/Reduce logic. > > However, in the second round it fails with the following exception. > > In 0.19 (In a cloud system - Fully Distributed Mode) > > java.lang.IllegalArgumentException: Wrong FS: > hdfs://cloud_hostname:9000/hadoop/tmp/hadoop/mapred/system/job_201106271322_9494/job.jar, > expected: file:/// > > at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:322) > > > > In 0.20.203 (my own system and not a cloud - Pseudo Distributed Mode) > > 11/10/12 00:35:42 INFO mapred.JobClient: Cleaning up the staging area > hdfs://localhost:54310/hadoop-0.20.203.0/HDFS/mapred/staging/arko/.staging/job_201110120017_0002 > > Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: > hdfs://localhost:54310/hadoop-0.20.203.0/HDFS/mapred/staging/arko/.staging/job_201110120017_0001/job.jar, > expected: file:/// > at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:354) > > It seems that Hadoop is not being able to delete the staging file for the job. > > Can you please suggest any reason for this? Please help! > > Thanks a lot in advance! > > Warm regards > Arko > > > > >