partitioning dataset

Denim Live Sat, 03 Jul 2010 09:30:27 -0700

Hello everyone,

I have written a custom partitioner for partitioning datasets. I want to 
partition two datasets with the same partitioner and then, in the next 
MapReduce job, have each mapper handle the same partition from both sources 
and perform some operation on them, such as a join. How can I ensure that 
one mapper gets the splits that correspond to the same partition from both 
sources?
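
To make the setup concrete, here is a minimal sketch in plain Java (no Hadoop 
classes) of what partitioning two datasets with the same partitioner means. 
The class name CoPartitionSketch, the methods partitionFor and partition, the 
hash-based rule, and the sample keys are all assumptions made up for 
illustration; this is not the actual partitioner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CoPartitionSketch {
    // Hypothetical partition rule: non-negative hash of the key, modulo the
    // partition count. Both datasets must use this exact rule and the same
    // numPartitions for their partitions to line up.
    public static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Groups the keys of one dataset by partition index (0 .. numPartitions-1).
    public static Map<Integer, List<String>> partition(List<String> keys,
                                                       int numPartitions) {
        Map<Integer, List<String>> parts = new TreeMap<Integer, List<String>>();
        for (int p = 0; p < numPartitions; p++) {
            parts.put(p, new ArrayList<String>());
        }
        for (String key : keys) {
            parts.get(partitionFor(key, numPartitions)).add(key);
        }
        return parts;
    }

    public static void main(String[] args) {
        int numPartitions = 3; // must be identical for both sources
        List<String> sourceA = Arrays.asList("k1", "k2", "k3", "k4");
        List<String> sourceB = Arrays.asList("k3", "k4", "k5");

        Map<Integer, List<String>> partsA = partition(sourceA, numPartitions);
        Map<Integer, List<String>> partsB = partition(sourceB, numPartitions);

        // Partition p of A and partition p of B are the pair that one mapper
        // would need to see together in the follow-up job.
        for (int p = 0; p < numPartitions; p++) {
            System.out.println("partition " + p + ": A=" + partsA.get(p)
                    + " B=" + partsB.get(p));
        }
    }
}
```

Because the rule and the partition count are the same for both sources, any 
key present in both ends up under the same partition index in each.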


Any help would be highly appreciated.
Alex
