Hi Mark,

1. With the old API the output files are named part-00000, part-00001, and so on; with the new API they are named part-r-00000, part-r-00001, etc. There is usually more than one output file: the number of output files is determined by the number of reducers of your map-reduce job.
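For example, a driver along these lines (just a minimal sketch using the new org.apache.hadoop.mapreduce API; the class name FirstStep, the three reducers, and the identity Mapper/Reducer are only there to keep the example self-contained) would write part-r-00000 through part-r-00002 into the output folder:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class FirstStep {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "first step");   // new API (0.20 style)
    job.setJarByClass(FirstStep.class);

    // Identity mapper/reducer only to keep the sketch runnable;
    // plug in your own Mapper and Reducer classes here.
    job.setMapperClass(Mapper.class);
    job.setReducerClass(Reducer.class);
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);

    // Three reducers -> three output files: part-r-00000, part-r-00001,
    // part-r-00002 (the old mapred API would name them part-00000 ... 02).
    job.setNumReduceTasks(3);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}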
2. If you'd like to consume the output of the first job, you just need to set the output folder of the first job as the input of the second job. A minimal chaining sketch follows below, after the quoted message.

On Mon, Jan 18, 2010 at 9:11 AM, Mark Kerzner <[email protected]> wrote:

> Hi,
>
> I am writing a second step to run after my first Hadoop job step finished.
> It is to pick up the results of the previous step and to do further
> processing on it. Therefore, I have two questions please.
>
> 1. Is the output file always called part-00000?
> 2. Am I perhaps better off reading all files in the output directory and
> how do I do it?
>
> Thank you,
> Mark
>
> PS. Thank you guys for answering my questions - that's a tremendous help
> and a great resource.
>
> Mark

--
Best Regards
Jeff Zhang
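Here is the chaining sketch mentioned in point 2 (again the new API; the class name TwoStepDriver and the /data/... paths are made-up examples, and no Mapper/Reducer is set so the identity defaults run -- substitute your own job setup):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TwoStepDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path input   = new Path("/data/in");         // example path
    Path stepOne = new Path("/data/step1-out");  // example path
    Path output  = new Path("/data/final-out");  // example path

    Job first = new Job(conf, "first step");
    first.setJarByClass(TwoStepDriver.class);
    FileInputFormat.addInputPath(first, input);
    FileOutputFormat.setOutputPath(first, stepOne);
    if (!first.waitForCompletion(true)) {
      System.exit(1);                            // stop if step one fails
    }

    // The second job simply points its input at the first job's output
    // folder; FileInputFormat reads every non-hidden file in it (i.e. all
    // the part-r-* files), so you never hard-code a single file name.
    Job second = new Job(conf, "second step");
    second.setJarByClass(TwoStepDriver.class);
    FileInputFormat.addInputPath(second, stepOne);
    FileOutputFormat.setOutputPath(second, output);
    System.exit(second.waitForCompletion(true) ? 0 : 1);
  }
}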
