Hi Mark,

1. With the old API the output files are named part-00000, part-00001, and so on; with the new API they are named part-r-00000, part-r-00001, etc. There is usually more than one output file: the number of output files is determined by the number of reducers of your map-reduce job.
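For example, a driver along these lines (just a minimal sketch using the new org.apache.hadoop.mapreduce API; the class name FirstStep, the three reducers, and the identity Mapper/Reducer are only there to keep the example self-contained) would write part-r-00000 through part-r-00002 into the output folder:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class FirstStep {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "first step");   // new API (0.20 style)
    job.setJarByClass(FirstStep.class);

    // Identity mapper/reducer only to keep the sketch runnable;
    // plug in your own Mapper and Reducer classes here.
    job.setMapperClass(Mapper.class);
    job.setReducerClass(Reducer.class);
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);

    // Three reducers -> three output files: part-r-00000, part-r-00001,
    // part-r-00002 (the old mapred API would name them part-00000 ... 02).
    job.setNumReduceTasks(3);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}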
2. If you'd like to consume the output of the first job, you just need to set the output folder of the first job as the input of the second job. A minimal chaining sketch follows below, after the quoted message.

On Mon, Jan 18, 2010 at 9:11 AM, Mark Kerzner <[email protected]> wrote:

> Hi,
>
> I am writing a second step to run after my first Hadoop job step finished.
> It is to pick up the results of the previous step and to do further
> processing on it. Therefore, I have two questions please.
>
> 1. Is the output file always called part-00000?
> 2. Am I perhaps better off reading all files in the output directory and
> how do I do it?
>
> Thank you,
> Mark
>
> PS. Thank you guys for answering my questions - that's a tremendous help
> and a great resource.
>
> Mark

--
Best Regards
Jeff Zhang
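Here is the chaining sketch mentioned in point 2 (again the new API; the class name TwoStepDriver and the /data/... paths are made-up examples, and no Mapper/Reducer is set so the identity defaults run -- substitute your own job setup):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TwoStepDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path input   = new Path("/data/in");         // example path
    Path stepOne = new Path("/data/step1-out");  // example path
    Path output  = new Path("/data/final-out");  // example path

    Job first = new Job(conf, "first step");
    first.setJarByClass(TwoStepDriver.class);
    FileInputFormat.addInputPath(first, input);
    FileOutputFormat.setOutputPath(first, stepOne);
    if (!first.waitForCompletion(true)) {
      System.exit(1);                            // stop if step one fails
    }

    // The second job simply points its input at the first job's output
    // folder; FileInputFormat reads every non-hidden file in it (i.e. all
    // the part-r-* files), so you never hard-code a single file name.
    Job second = new Job(conf, "second step");
    second.setJarByClass(TwoStepDriver.class);
    FileInputFormat.addInputPath(second, stepOne);
    FileOutputFormat.setOutputPath(second, output);
    System.exit(second.waitForCompletion(true) ? 0 : 1);
  }
}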
