My understanding is that a StoreFunc in Pig plays a role similar to that of
an OutputFormat in pure Java MapReduce jobs.
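
On the reducer-routing question quoted below, here is a minimal sketch of
how a key can be mapped to a reducer. It assumes the default hash
partitioning is in effect and mirrors the arithmetic of Hadoop's
HashPartitioner; the class name PartitionSketch and the method partitionFor
are made up for illustration and are not part of Pig or Hadoop. A sort uses
a different partitioning scheme, so this does not cover that case.

```java
public class PartitionSketch {
    // Same arithmetic as Hadoop's default HashPartitioner:
    // clear the sign bit of the hash, then take it modulo the reducer count.
    static int partitionFor(Object key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int numReducers = 4; // illustrative value, not a Pig default
        String[] groupKeys = {"apple", "banana", "apple", "cherry"};
        for (String k : groupKeys) {
            // Equal keys always print the same reducer index.
            System.out.println(k + " -> reducer " + partitionFor(k, numReducers));
        }
    }
}
```

The point is only that equal group or join keys always land on the same
reducer, because the index depends on nothing but the key's hash and the
number of reducers.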

On Wed, Mar 27, 2013 at 4:41 PM, Jonathan Coveney <[email protected]> wrote:

> As for when the StoreFunc runs, it depends on whether the job is map-only
> or map/reduce. It runs in the last phase, which is generally the reduce
> phase.
>
> As for how Pig knows where to send its output, there are keys in Pig.
> Basically, a reduce phase is necessary any time you have a group, join, or
> sort. In the case of a group or join, the key is the group key or the join
> key, respectively. In the case of a sort it is more complicated.
>
>
> 2013/3/27 Mark <[email protected]>
>
> > I understand that in the traditional map/reduce paradigm each key gets
> > sent to the same reducer, sorted, but in Pig there is no such thing as a
> > "key". I'm curious to know how Pig knows which reducer to send its
> > output to.
> >
> > So when creating a custom StoreFunc, is there any guarantee on the
> > ordering of Tuples that come into putNext?
> >
> > And another, even more basic question: do StoreFuncs operate in the map
> > phase or the reduce phase?
> >
> > Thanks
> >
> >
> >
>
