As for when the StoreFunc runs, it depends on whether the job is map-only or map/reduce. It runs in the last phase of the job. Generally this is the reduce phase.
As for how Pig knows where to send its output, there are keys in Pig. Basically, a reduce phase is necessary any time you have a group, join, or sort. In the case of a group or a join, the key is the group key or the join key, respectively. In the case of a sort it is more complicated.

2013/3/27 Mark <[email protected]>

> I understand in the traditional map/reduce paradigm that each key will get
> sent to the same reducer sorted but in pig there is no such thing as a
> "key". I'm curious to know how pig knows to which reducer to send its
> output to?
>
> So when creating a custom StoreFunc is there any guarantee on the ordering
> of Tuples that come into putNext?
>
> And another even more basic question. Do StoreFuncs operate at the Map
> phase or Reduce phase?
>
> Thanks
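
To make the point about group and join keys concrete, here is a minimal sketch of how a key can be mapped to a reducer. It uses the same formula as Hadoop's default HashPartitioner; whether a given Pig job uses exactly this partitioner is an assumption, and it does not cover the sort case. The class name ReducerRouting, the method reducerFor, and the sample keys are made up for illustration.

```java
class ReducerRouting {

    // Same formula as Hadoop's default HashPartitioner: clear the sign bit
    // of the hash, then take it modulo the number of reduce tasks.
    static int reducerFor(Object key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        // Hypothetical group keys, e.g. the values of the field named in GROUP ... BY.
        String[] groupKeys = {"alice", "bob", "alice", "carol", "bob"};
        int numReducers = 3;

        // Equal keys always map to the same reducer index, which is why
        // all records for one group arrive at a single reducer.
        for (String key : groupKeys) {
            System.out.println(key + " -> reducer " + reducerFor(key, numReducers));
        }
    }
}
```

The only property that matters here is that the mapping is a pure function of the key, so every record with the same group or join key ends up at the same reducer.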
