Hive is one approach (similar to a regular database, but not exactly the same).
if you are looking at a MapReduce program, then use the MultipleInputs class (a minimal join sketch is included below the quoted mail):
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html

On Tue, May 29, 2012 at 4:02 PM, Michel Segel <michael_se...@hotmail.com> wrote:

> Hive?
> Sure.... Assuming you mean that the id is a FK common amongst the tables...
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On May 29, 2012, at 5:29 AM, "liuzhg" <liu...@cernet.com> wrote:
>
> > Hi,
> >
> > I wonder whether Hadoop can effectively solve the following problem:
> >
> > ==========================================
> > input files: a.txt, b.txt
> > result: c.txt
> >
> > a.txt:
> > id1,name1,age1,...
> > id2,name2,age2,...
> > id3,name3,age3,...
> > id4,name4,age4,...
> >
> > b.txt:
> > id1,address1,...
> > id2,address2,...
> > id3,address3,...
> >
> > c.txt:
> > id1,name1,age1,address1,...
> > id2,name2,age2,address2,...
> > ========================================
> >
> > I know that it can be done well by a database,
> > but I want to handle it with Hadoop if possible.
> > Can Hadoop meet this requirement?
> >
> > Any suggestion would help. Thank you very much!
> >
> > Best Regards,
> >
> > Gump

--
Nitin Pawar
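
A minimal sketch of such a reduce-side join with MultipleInputs, assuming the comma-separated layout from Gump's example; the class and field names (IdJoin, AMapper, BMapper, JoinReducer) are illustrative, not taken from any existing code:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class IdJoin {

    // Emits (id, "A:name,age,...") for each record of a.txt
    public static class AMapper extends Mapper<Object, Text, Text, Text> {
        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            int comma = line.indexOf(',');
            if (comma > 0) {
                context.write(new Text(line.substring(0, comma)),
                              new Text("A:" + line.substring(comma + 1)));
            }
        }
    }

    // Emits (id, "B:address,...") for each record of b.txt
    public static class BMapper extends Mapper<Object, Text, Text, Text> {
        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            int comma = line.indexOf(',');
            if (comma > 0) {
                context.write(new Text(line.substring(0, comma)),
                              new Text("B:" + line.substring(comma + 1)));
            }
        }
    }

    // Joins both sides on id; only ids present in both files are written.
    public static class JoinReducer extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            String aPart = null;
            String bPart = null;
            for (Text v : values) {
                String s = v.toString();
                if (s.startsWith("A:")) aPart = s.substring(2);
                else if (s.startsWith("B:")) bPart = s.substring(2);
            }
            if (aPart != null && bPart != null) {
                context.write(key, new Text(aPart + "," + bPart));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "id join");
        job.setJarByClass(IdJoin.class);

        // One mapper per input file, both feeding the same reducer.
        MultipleInputs.addInputPath(job, new Path(args[0]),
                TextInputFormat.class, AMapper.class);
        MultipleInputs.addInputPath(job, new Path(args[1]),
                TextInputFormat.class, BMapper.class);

        job.setReducerClass(JoinReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileOutputFormat.setOutputPath(job, new Path(args[2]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Note that TextOutputFormat separates the key and value with a tab by default, so the output looks like "id1<TAB>name1,age1,...,address1,..."; if one of the two files is small enough to fit in memory, a map-side join (distributing the small file to every mapper) or a Hive join over external tables would also do the job.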