One way to achieve this would be to:

1. Emit the same value multiple times, each time with a different key.
2. Use these different keys, in conjunction with the partitioner, to achieve the desired distribution.
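The two steps above could be sketched roughly as follows. This is a plain-Java simulation and not actual Hadoop code: `BroadcastSketch`, `TaggedKey`, and `replicate` are hypothetical names made up for illustration. In a real job, the routing logic would presumably live in a subclass of `org.apache.hadoop.mapreduce.Partitioner` that overrides `getPartition(key, value, numPartitions)`, and the composite key would need to be a `WritableComparable`.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of the idea; no Hadoop classes are used here.
class BroadcastSketch {

    /** Hypothetical composite key: the original join key plus a target reducer index. */
    static final class TaggedKey {
        final String joinKey;
        final int targetReducer;

        TaggedKey(String joinKey, int targetReducer) {
            this.joinKey = joinKey;
            this.targetReducer = targetReducer;
        }
    }

    /** Map side: emit the same record once per reducer, each time under a different key. */
    static List<TaggedKey> replicate(String joinKey, int numReducers) {
        List<TaggedKey> emitted = new ArrayList<>();
        for (int r = 0; r < numReducers; r++) {
            emitted.add(new TaggedKey(joinKey, r));
        }
        return emitted;
    }

    /** Partitioner side: route by the tag instead of hashing the join key. */
    static int getPartition(TaggedKey key, int numReducers) {
        return key.targetReducer % numReducers;
    }

    public static void main(String[] args) {
        int numReducers = 4;
        for (TaggedKey key : replicate("hotKey", numReducers)) {
            System.out.println(key.joinKey + "#" + key.targetReducer
                    + " -> reducer " + getPartition(key, numReducers));
        }
    }
}
```

With this kind of key, each reducer receives exactly one copy of the replicated record. The reduce side would then need to ignore the tag and join on the original key only.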
Hope that helps!
Karthik

On Thu, Jul 5, 2012 at 12:19 AM, 静行 <xiaoyong.den...@taobao.com> wrote:

> I join two tables on many different key values, but only a few key
> values have a large amount of data to join and account for most of the
> time, so I want to distribute those key values to every reduce task
> for the join.
>
> From: Devaraj k [mailto:devara...@huawei.com]
> Sent: July 5, 2012 14:06
> To: mapreduce-user@hadoop.apache.org
> Subject: RE: How To Distribute One Map Data To All Reduce Tasks?
>
> Can you explain your use case in more detail?
>
> Thanks
> Devaraj
> ------------------------------
> From: 静行 [xiaoyong.den...@taobao.com]
> Sent: Thursday, July 05, 2012 9:53 AM
> To: mapreduce-user@hadoop.apache.org
> Subject: Re: How To Distribute One Map Data To All Reduce Tasks?
>
> Thanks!
>
> But what I really want to know is how I can distribute one map's data to
> every reduce task, not just one of the reduce tasks.
>
> Do you have any ideas?
>
> From: Devaraj k [mailto:devara...@huawei.com]
> Sent: July 5, 2012 12:12
> To: mapreduce-user@hadoop.apache.org
> Subject: RE: How To Distribute One Map Data To All Reduce Tasks?
>
> You can distribute the map data to the reduce tasks using a Partitioner. By
> default, a Job uses the HashPartitioner. You can write a custom Partitioner
> according to your needs.
>
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Partitioner.html
>
> Thanks
> Devaraj
> ------------------------------
> From: 静行 [xiaoyong.den...@taobao.com]
> Sent: Thursday, July 05, 2012 9:00 AM
> To: mapreduce-user@hadoop.apache.org
> Subject: How To Distribute One Map Data To All Reduce Tasks?
>
> Hi all:
>
> How can I distribute one map's data to all reduce tasks?