In some of our pipelines, Pig jobs are one stage of the pipeline, which also consists of other Hadoop jobs, shell executions, etc.
We currently do this by using intermediate file dumps, along the lines of the sketch below.
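For example (a minimal sketch - the path, relation, and class names are made up), the Pig side STOREs a relation as tab-delimited text with PigStorage, whose default field delimiter is a tab, and a plain TextInputFormat mapper in the Java job splits each line on '\t':

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Pig side (illustrative):
//   STORE results INTO '/tmp/pig_out' USING PigStorage();
// PigStorage writes one tuple per line with tab-separated fields, so the
// downstream MapReduce job can read it with the default TextInputFormat.
public class PigOutputMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // Fields arrive in the column order of the relation STOREd by Pig.
        String[] fields = line.toString().split("\t", -1);
        if (fields.length >= 2) {
            // Illustrative: key on the first column, pass the second through.
            context.write(new Text(fields[0]), new Text(fields[1]));
        }
    }
}

Going the other way, the MapReduce job writes tab-delimited text and the next Pig script picks it up with LOAD '...' USING PigStorage() AS (...). For anything beyond flat text, one common route is a shared custom LoadFunc/StoreFunc backed by the same InputFormat/OutputFormat the Java job uses.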
Regards,
Mridul

On Friday 23 July 2010 10:45 PM, Corbin Hoenes wrote:

What are some strategies for having Pig and Java MapReduce jobs exchange data? E.g., if we find that a particular Pig script in a chain is too slow and could be optimized with a custom MapReduce job, we'd want Pig to write the data out in a format that MapReduce could access, and vice versa.
