My understanding is that a StoreFunc in Pig plays a role similar to that of an OutputFormat in a pure Java MapReduce job.
On Wed, Mar 27, 2013 at 4:41 PM, Jonathan Coveney <[email protected]> wrote:

> As far as when the StoreFunc works, it depends on whether the job is map
> only or map/reduce. It'll work on the last phase. Generally this is the
> reduce phase.
>
> As far as how Pig knows where to send its output, there are keys in Pig.
> Basically, a reduce job is necessary any time you have a group, join, or
> sort. In the case of a group or join, the key is the group key and the join
> key, respectively. In the case of a sort it is more complicated.
>
>
> 2013/3/27 Mark <[email protected]>
>
> > I understand in the traditional map/reduce paradigm that each key will get
> > sent to the same reducer sorted but in pig there is no such thing as a
> > "key". I'm curious to know how pig knows to which reducer to send its
> > output to?
> >
> > So when creating a custom StoreFunc is there any guarantee on the ordering
> > of Tuples that come into putNext?
> >
> > And another even more basic question. Do StoreFuncs operate at the Map
> > phase or Reduce phase?
> >
> > Thanks
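
To make the key-to-reducer routing from the quoted reply a bit more concrete, here is a rough standalone sketch in plain Java. It does not use the Pig or Hadoop APIs. It only mimics the arithmetic of Hadoop's default HashPartitioner, which, as far as I know, is what a GROUP or JOIN ends up using unless you specify PARTITION BY (ORDER BY is partitioned differently, so this does not cover the sort case). The class name ReducerRouting, the helper names, and the sample records are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: not part of Pig or Hadoop.
class ReducerRouting {

    // Same arithmetic as Hadoop's default HashPartitioner:
    // clear the sign bit of the hash, then take it modulo the reducer count.
    public static int partitionFor(Object key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Simulates the shuffle: each {key, value} record is assigned to a reducer
    // index derived from its key, so equal keys always land together.
    public static Map<Integer, List<String>> route(List<String[]> records, int numReducers) {
        Map<Integer, List<String>> byReducer = new TreeMap<>();
        for (String[] rec : records) {
            int reducer = partitionFor(rec[0], numReducers);
            byReducer.computeIfAbsent(reducer, k -> new ArrayList<>())
                     .add(rec[0] + "=" + rec[1]);
        }
        return byReducer;
    }

    public static void main(String[] args) {
        // Assumed sample data: the first field stands in for the group key.
        List<String[]> records = Arrays.asList(
            new String[] {"alice", "1"},
            new String[] {"bob", "2"},
            new String[] {"alice", "3"},
            new String[] {"carol", "4"});

        route(records, 3).forEach((reducer, values) ->
            System.out.println("reducer " + reducer + " <- " + values));
    }
}
```

The point is just that once a statement like GROUP supplies a key, every record with that key goes to the same reducer, and a StoreFunc running in that reduce phase sees whatever that reducer emits.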
