Hello Sandy -
Your partitioner isn't using any information from the key/value pair - it's
only using the value T, which is read once from the job configuration.
getPartition() will always return the same value, so all of your data is
being sent to one reducer. :P
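To see why, here's a plain-Java sketch of the arithmetic (no Hadoop classes; T = 100, two reducers, and the sample values are all made up for illustration). With integer division, every value below T * (T / numReduceTasks) collapses into partition 0, and larger values can even produce a partition number outside the valid range. Taking the result modulo numReduceTasks is one way to keep it in [0, numReduceTasks):

```java
// Illustration only: mirrors the getPartition() arithmetic from the thread.
// T and the sample values are hypothetical.
public class PartitionDemo {
    // The formula as written in the original partitioner.
    static int original(int value, int T, int numReduceTasks) {
        int newT = T / numReduceTasks;   // e.g. 100 / 2 = 50
        int id = value / T;              // integer division truncates
        return id / newT;
    }

    // One possible fix: bound the result with modulo.
    static int bounded(int value, int T, int numReduceTasks) {
        return (value / T) % numReduceTasks;
    }

    public static void main(String[] args) {
        int T = 100, reducers = 2;
        // Every value below T * newT = 5000 lands in partition 0:
        System.out.println(original(42, T, reducers));     // 0
        System.out.println(original(4999, T, reducers));   // 0
        // Large values can fall outside [0, reducers) entirely:
        System.out.println(original(20000, T, reducers));  // 4 -- invalid partition
        // The bounded version always stays in range:
        System.out.println(bounded(20000, T, reducers));   // 0
        System.out.println(bounded(150, T, reducers));     // 1
    }
}
```

So unless the incoming values span a very wide range relative to T, everything ends up in part-00000, which matches what Sandy is seeing.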

cheers,
-James

On Fri, Jan 30, 2009 at 1:32 PM, Sandy <snickerdoodl...@gmail.com> wrote:

> Hello,
>
> Could someone point me toward some more documentation on how to write one's
> own partition class? I'm having quite a bit of trouble getting mine to
> work. So far, it looks something like this:
>
> public class myPartitioner extends MapReduceBase implements
> Partitioner<IntWritable, IntWritable> {
>
>    private int T;
>
>    public void configure(JobConf job) {
>        super.configure(job);
>        String myT = job.get("tval");    // this is user defined
>        T = Integer.parseInt(myT);
>    }
>
>    public int getPartition(IntWritable key, IntWritable value, int
> numReduceTasks) {
>        int newT = T / numReduceTasks;
>        int id = value.get() / T;
>        return id / newT;
>    }
> }
>
> In the run() function of my M/R program I just set it using:
>
> conf.setPartitionerClass(myPartitioner.class);
>
> Is there anything else I need to set in the run() function?
>
>
> The code compiles fine. When I run it, I know it is "using" the
> partitioner,
> since I get different output than if I just let it use HashPartitioner.
> However, it is not splitting between the reducers at all! If I set the
> number of reducers to 2, all the output shows up in part-00000, while
> part-00001 has nothing.
>
> I am having trouble debugging this since I don't know how I can observe the
> value of numReduceTasks (which I assume is being set by the system). Is
> this a proper assumption?
>
> If I try to insert any println() statements in the function, nothing shows
> up in either my terminal or my log files. Could someone give me some
> general advice on how best to debug pieces of code like this?
>