Hello Sandy - Look closely at the integer arithmetic in getPartition(). id = value.get() / T is 0 for every value smaller than T, and dividing again by newT = T / numReduceTasks keeps the result at 0 unless value is at least T * T / numReduceTasks. getPartition() is almost certainly returning the same value for every record, so all of your data is being sent to one reducer. :P
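You can check the arithmetic without touching Hadoop at all. Below is a minimal sketch that mirrors the quoted getPartition() logic using plain ints (no Hadoop types); T = 100 and the sample values are made-up numbers, since the thread doesn't say what "tval" actually is:

```java
public class PartitionCheck {
    // Mirrors the getPartition() from the quoted code, minus the Hadoop types.
    static int getPartition(int value, int T, int numReduceTasks) {
        int newT = T / numReduceTasks;
        int id = value / T;   // integer division: 0 whenever value < T
        return id / newT;     // still 0 unless value >= T * T / numReduceTasks
    }

    public static void main(String[] args) {
        int T = 100;          // sample "tval"; the real one is user-defined
        for (int value : new int[] {5, 37, 60, 99}) {
            System.out.println(value + " -> partition "
                    + getPartition(value, T, 2));
        }
        // All four values land in partition 0, i.e. one reducer gets everything.
    }
}
```

With T = 100 and two reducers, a value would need to reach 5000 before this ever returns a nonzero partition, which matches the empty part-00001 Sandy is seeing.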
cheers,
-James

On Fri, Jan 30, 2009 at 1:32 PM, Sandy <snickerdoodl...@gmail.com> wrote:
> Hello,
>
> Could someone point me toward some more documentation on how to write one's
> own partition class? I am having quite a bit of trouble getting mine to
> work. So far, it looks something like this:
>
> public class myPartitioner extends MapReduceBase implements
>         Partitioner<IntWritable, IntWritable> {
>
>     private int T;
>
>     public void configure(JobConf job) {
>         super.configure(job);
>         String myT = job.get("tval"); // this is user-defined
>         T = Integer.parseInt(myT);
>     }
>
>     public int getPartition(IntWritable key, IntWritable value,
>             int numReduceTasks) {
>         int newT = T / numReduceTasks;
>         int id = value.get() / T;
>         return id / newT;
>     }
> }
>
> In the run() function of my M/R program I just set it using:
>
> conf.setPartitionerClass(myPartitioner.class);
>
> Is there anything else I need to set in the run() function?
>
> The code compiles fine. When I run it, I know it is "using" the partitioner,
> since I get different output than if I just let it use HashPartitioner.
> However, it is not splitting between the reducers at all! If I set the
> number of reducers to 2, all the output shows up in part-00000, while
> part-00001 has nothing.
>
> I am having trouble debugging this since I don't know how I can observe the
> values of numReduceTasks (which I assume is being set by the system). Is
> this a proper assumption?
>
> If I try to insert any println() statements in the function, it isn't
> output to either my terminal or my log files. Could someone give me some
> general advice on how best to debug pieces of code like this?
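On the debugging question: partition logic is easy to exercise off-cluster, since getPartition() is just a pure function of (value, T, numReduceTasks). The sketch below shows one hypothetical fix (bucket values by T, then wrap the buckets across reducers with a modulus; this is my guess at the intent, not something from the thread) and a tiny harness that counts how many of 1000 synthetic values land in each of two partitions:

```java
import java.util.Arrays;

public class PartitionerSketch {
    // Hypothetical fix: bucket values into ranges of width T, then spread
    // the buckets across reducers instead of dividing a second time.
    static int getPartition(int value, int T, int numReduceTasks) {
        int bucket = value / T;
        // Mask to keep the result non-negative, as HashPartitioner does.
        return (bucket & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int T = 100;                // sample "tval", as above
        int[] counts = new int[2];  // two reducers, as in the thread
        for (int value = 0; value < 1000; value++) {
            counts[getPartition(value, T, 2)]++;
        }
        System.out.println(Arrays.toString(counts)); // prints [500, 500]
    }
}
```

A loop like this answers "is it splitting?" in seconds, with ordinary println() output, before the job ever runs; on the cluster itself, println() from a task goes to the per-task stdout/stderr logs rather than the submitting terminal.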