Attaching my sample code (this InputFormat generates 1 reducer task for every 5
mapper tasks):

import java.io.IOException;

import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

public class MyInputFormat extends TextInputFormat {

    @Override
    public InputSplit[] getSplits(JobConf job, int numSplits)
            throws IOException {
        InputSplit[] splits = super.getSplits(job, numSplits);
        // One reducer for every 5 map splits, with a minimum of one.
        int reducerNum = splits.length / 5;
        if (reducerNum == 0) {
            reducerNum = 1;
        }

        job.setNumReduceTasks(reducerNum);
        return splits;
    }
}
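The split-to-reducer ratio above can be checked in isolation. This is a minimal standalone sketch (no Hadoop dependency; the class and method names are my own, not part of the sample):

```java
// Standalone sketch of the ratio used in MyInputFormat.getSplits():
// one reducer per 5 map splits, with a minimum of one reducer.
public class ReducerCount {
    static int reducersFor(int splitCount) {
        int reducerNum = splitCount / 5;
        return reducerNum == 0 ? 1 : reducerNum;
    }

    public static void main(String[] args) {
        System.out.println(reducersFor(3));   // fewer than 5 splits -> 1
        System.out.println(reducersFor(10));  // 10 splits -> 2
    }
}
```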


After Pig integrates the InputFormat into LoadFunc (PIG-966), it will be
possible to change the number of reducer tasks dynamically.


Jeff Zhang


On Fri, Nov 27, 2009 at 3:38 PM, Jeff Zhang <[email protected]> wrote:

> I got a suggestion from Owen O'Malley that we can control the reducer
> number in the InputFormat; I have tried it, and it works.
>
>
> Jeff Zhang
>
>
>
>
> On Sat, Nov 14, 2009 at 1:23 AM, Alan Gates <[email protected]> wrote:
>
>>
>> On Nov 12, 2009, at 2:49 PM, Scott Carey wrote:
>>
>>  Is it possible to have a script at least use the default configured
>>> Hadoop value?  Or is there a way to do that already?
>>>
>>
>> If the user doesn't specify a parallelism, Pig doesn't set a value in
>> the JobConf for the reduce phase, which means the job will pick up the
>> default for the cluster.  Unless cluster administrators change it, that
>> default is 1.
>>
>>
>>>  Alan.
>>
>
>