Hive is one approach (it is similar to a conventional database, but not exactly the same).

If you would rather write a MapReduce program, use MultipleInputs to read both files in the same job:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html
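
A rough sketch of the reduce-side join such a job would perform, written as plain Java with no Hadoop classes so that it runs on its own. Everything named here (ReduceSideJoinSketch, map, reduce, join, and the "A"/"B" tags) is made up for illustration. In a real job, MultipleInputs.addInputPath would attach one mapper to each input file, and the framework's shuffle would do the grouping by id that the TreeMap imitates here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of a reduce-side join; not a Hadoop job.
class ReduceSideJoinSketch {

    // "Map" step: key each record by its id and tag it with its source file,
    // so the reduce step can tell a.txt records from b.txt records.
    static void map(String tag, List<String> lines, Map<String, List<String>> shuffle) {
        for (String line : lines) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // skip lines that have no id field
            }
            String id = line.substring(0, comma);
            String rest = line.substring(comma + 1);
            shuffle.computeIfAbsent(id, k -> new ArrayList<>()).add(tag + ":" + rest);
        }
    }

    // "Reduce" step: for one id, pair every A record with every B record.
    // Ids present in only one input produce nothing (inner join).
    static void reduce(String id, List<String> values, List<String> out) {
        List<String> fromA = new ArrayList<>();
        List<String> fromB = new ArrayList<>();
        for (String v : values) {
            if (v.startsWith("A:")) {
                fromA.add(v.substring(2));
            } else {
                fromB.add(v.substring(2));
            }
        }
        for (String a : fromA) {
            for (String b : fromB) {
                out.add(id + "," + a + "," + b);
            }
        }
    }

    // Runs both "map" passes, then one "reduce" call per id in sorted key order.
    static List<String> join(List<String> a, List<String> b) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        map("A", a, shuffle);
        map("B", b, shuffle);
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : shuffle.entrySet()) {
            reduce(e.getKey(), e.getValue(), out);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList(
                "id1,name1,age1", "id2,name2,age2", "id3,name3,age3", "id4,name4,age4");
        List<String> b = Arrays.asList(
                "id1,address1", "id2,address2", "id3,address3");
        for (String row : join(a, b)) {
            System.out.println(row);
        }
    }
}
```

This is an inner join, so id4 is dropped because it has no address record. To keep unmatched rows as well, the reduce step would have to emit the A record even when no B record arrived for that id.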



On Tue, May 29, 2012 at 4:02 PM, Michel Segel <michael_se...@hotmail.com> wrote:

> Hive?
> Sure.... Assuming you mean that the id is a FK common amongst the tables...
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On May 29, 2012, at 5:29 AM, "liuzhg" <liu...@cernet.com> wrote:
>
> > Hi,
> >
> > I wonder whether Hadoop can effectively solve the following problem:
> >
> > ==========================================
> > input file: a.txt, b.txt
> > result: c.txt
> >
> > a.txt:
> > id1,name1,age1,...
> > id2,name2,age2,...
> > id3,name3,age3,...
> > id4,name4,age4,...
> >
> > b.txt:
> > id1,address1,...
> > id2,address2,...
> > id3,address3,...
> >
> > c.txt
> > id1,name1,age1,address1,...
> > id2,name2,age2,address2,...
> > ========================================
> >
> > I know that a database can do this well.
> > But I want to handle it with hadoop if possible.
> > Can hadoop meet the requirement?
> >
> > Any suggestion can help me. Thank you very much!
> >
> > Best Regards,
> >
> > Gump
> >
> >
> >
>



-- 
Nitin Pawar
