The PigWriter is only used if we are doing a map-only job. In setupMapPipe the writer is only created if we aren't grouping. (If we aren't grouping, there will be no reduce. No reduce -> no shuffle.) If we are grouping, we don't create a PigWriter, and the output of the map goes through the normal collector, where it can be sorted/combined/shuffled/sorted/reduced.
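
To make the routing concrete, here is a minimal Java sketch of that decision. It is not Pig's actual code: MapOutputRouting, Sink, DirectWriterSink, CollectorSink and chooseSink are hypothetical names invented for illustration, and the two sink classes only stand in for PigWriter and for the OutputCollector that Hadoop passes to run(). The only thing it is meant to show is that the direct writer exists solely when there is no grouping, so the shuffle never has to look for files it did not produce.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the decision described above.
// None of these types are Pig's or Hadoop's real classes.
public class MapOutputRouting {

    /** Where a map task's records end up. */
    interface Sink {
        void write(String record);
    }

    /** Stands in for PigWriter: writes final output directly, bypassing the shuffle. */
    static class DirectWriterSink implements Sink {
        final List<String> written = new ArrayList<>();

        @Override
        public void write(String record) {
            written.add(record);
        }
    }

    /** Stands in for the framework-supplied collector that feeds sort/combine/shuffle. */
    static class CollectorSink implements Sink {
        final List<String> collected = new ArrayList<>();

        @Override
        public void write(String record) {
            collected.add(record);
        }
    }

    /** Assumed shape of the check: a direct writer is created only when not grouping. */
    static Sink chooseSink(boolean grouping, CollectorSink frameworkCollector) {
        if (grouping) {
            // A reduce phase follows, so records must go through the normal collector.
            return frameworkCollector;
        }
        // Map-only job: no reduce, no shuffle, so write the final output directly.
        return new DirectWriterSink();
    }

    public static void main(String[] args) {
        CollectorSink collector = new CollectorSink();

        Sink mapOnly = chooseSink(false, collector);
        Sink grouped = chooseSink(true, collector);

        mapOnly.write("a"); // lands in the direct writer, never seen by the collector
        grouped.write("b"); // lands in the framework collector

        System.out.println("map-only uses direct writer: " + (mapOnly instanceof DirectWriterSink));
        System.out.println("collector received: " + collector.collected);
    }
}
```

In the grouping case the sketch hands back the very collector it was given, which is the point of the explanation above: nothing is written behind the framework's back when a shuffle is going to happen.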

ben

On Thursday 03 April 2008 06:18:53 pi song wrote:
> In PigMapReduce.run(RecordReader input, OutputCollector output, Reporter
> reporter), as I can see, Pig does create its own OutputCollector and write
> output to its own files (using PigWriter).
> How does the shuffle process work if the files aren't created from the
> outputCollector supplied in run(RecordReader input, OutputCollector output,
> Reporter reporter)? Do we just put the output files to the location where
> shuffle expects?
>
> Thanks for explanation in advance,
> Pi
