Take a look at the PARALLEL clause:

http://pig.apache.org/docs/r0.7.0/cookbook.html#Use+the+PARALLEL+Clause
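
For example, here is a minimal sketch of your script with PARALLEL added to the GROUP statement. The value 10 is only a placeholder I picked, so tune it for your data and cluster. The output name `dedup` is also made up, and I left out your AS clause since I don't know your schema.

```pig
-- Sketch only: 10 is an arbitrary reducer count, not a recommendation.
-- PARALLEL applies to the reduce phase triggered by this GROUP.
grp = GROUP alias BY key PARALLEL 10;

-- Keep one record per key; 'dedup' is a hypothetical output alias.
dedup = FOREACH grp {
    record = LIMIT alias 1;
    GENERATE FLATTEN(record);
};
```

If you would rather not annotate each statement, putting `SET default_parallel 10;` at the top of the script should apply the same reducer count to every reduce-side operator.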

On Fri, May 17, 2013 at 10:48 AM, Vincent Barat <vincent.ba...@gmail.com> wrote:

> Hi,
>
> I use this query to remove duplicate entries from a set of input files
> (I cannot use DISTINCT since some fields can differ between duplicates):
>
> grp = GROUP alias BY key;
> alias = FOREACH grp {
>   record = LIMIT alias 1;
>   GENERATE FLATTEN(record) AS ... ;
> }
>
> It appears that this query always runs with 1 reducer (I set the default
> number of reducers to 0 to let Pig decide), whatever the size of my input data.
>
> Is this normal behavior? How can I speed up the query by using
> several reducers?
>
> Thanks a lot for your help.
>
>
>
