On 01/17/2010 05:11 PM, Mark Kerzner wrote:
Hi,
I am writing a second step to run after my first Hadoop job step finishes.
It is to pick up the results of the previous step and do further
processing on them. I therefore have two questions, please.
1. Is the output file always called part-00000?
Relying on that name gets too far into the internals of Hadoop. It should
probably be treated as a last resort.
2. Am I perhaps better off reading all files in the output directory, and
if so, how do I do it?
Would Cascading ( cascading.org ), which provides a framework for workflow
management, solve what you are looking for?
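If you do want to read the files yourself, here is a minimal sketch rather
than a definitive answer. A job writes one part file per reduce task
(part-00000, part-00001, ...), so part-00000 is the only output file only
when there is a single reducer. If the second step is itself a Hadoop job,
it is usually enough to give the whole output directory as its input path.
The sketch below assumes the output directory is on, or has been copied to,
a local filesystem and uses only the Java standard library; for HDFS the
same idea would go through Hadoop's FileSystem API instead. The class name
PartFileReader and the method readAllParts are hypothetical names chosen
for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reads a job's output
// directory from a local filesystem path.
public class PartFileReader {

    /** Returns every line of the part-* files in outputDir, in file-name order. */
    public static List<String> readAllParts(Path outputDir) throws IOException {
        List<Path> parts = new ArrayList<>();
        // The glob keeps only part files and skips other entries
        // in the directory, such as _logs.
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(outputDir, "part-*")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    parts.add(p);
                }
            }
        }
        // Directory listing order is unspecified, so sort by path:
        // part-00000, part-00001, ...
        Collections.sort(parts);

        List<String> lines = new ArrayList<>();
        for (Path p : parts) {
            // Assumption: the job wrote UTF-8 text output.
            lines.addAll(Files.readAllLines(p, StandardCharsets.UTF_8));
        }
        return lines;
    }
}
```

Calling PartFileReader.readAllParts(Paths.get("output")) would then return
the lines of all part files in order, however many reducers the first step
used. It loads everything into memory, so it only suits small outputs.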
Thank you,
Mark
PS. Thank you guys for answering my questions - that's a tremendous help and
a great resource.
Mark