Re: How to mapreduce in the scenario

Robert Evans Tue, 29 May 2012 12:34:07 -0700

Yes, you can do it.  In Pig you would write something like:

A = LOAD 'a.txt' USING PigStorage(',') AS (id, name, age, ...);
B = LOAD 'b.txt' USING PigStorage(',') AS (id, address, ...);
C = JOIN A BY id, B BY id;
STORE C INTO 'c.txt' USING PigStorage(',');


Hive can do it similarly.  Or you could write your own join directly in
map/reduce, or use the data_join jar.
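
If you write it yourself, the usual approach is a reduce-side join: the mapper tags each record with the file it came from and emits it under its id, and the reducer pairs up the records that share an id. Below is a minimal sketch of that logic in plain Python, not Hadoop API code. The function names (map_record, reduce_join, join_files) are made up for illustration, and the in-memory grouping only stands in for the shuffle that Hadoop would do for you.

```python
from collections import defaultdict
from itertools import product


def map_record(source, line):
    """Map step: emit (id, (source tag, remaining fields))."""
    key, *rest = line.strip().split(",")
    return key, (source, rest)


def reduce_join(key, tagged_values):
    """Reduce step: pair every a.txt record with every b.txt record for one id."""
    left = [fields for tag, fields in tagged_values if tag == "a"]
    right = [fields for tag, fields in tagged_values if tag == "b"]
    # Inner join: an id missing from either file produces no output.
    return [",".join([key] + l + r) for l, r in product(left, right)]


def join_files(a_lines, b_lines):
    # Stand-in for the shuffle phase: group mapper output by key.
    groups = defaultdict(list)
    for source, lines in (("a", a_lines), ("b", b_lines)):
        for line in lines:
            if line.strip():
                key, value = map_record(source, line)
                groups[key].append(value)
    out = []
    for key in sorted(groups):
        out.extend(reduce_join(key, groups[key]))
    return out


if __name__ == "__main__":
    a = ["id1,name1,age1", "id2,name2,age2", "id4,name4,age4"]
    b = ["id1,address1", "id2,address2"]
    for row in join_files(a, b):
        print(row)
```

With the sample input above, id4 is dropped because it has no match in b, which is the same inner-join behaviour the Pig JOIN gives by default.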

--Bobby Evans

On 5/29/12 4:08 AM, "lzg" <lzg_...@163.com> wrote:

Hi,

I wonder whether Hadoop can effectively solve the following problem:

==========================================
input file: a.txt, b.txt
result: c.txt

a.txt:
id1,name1,age1,...
id2,name2,age2,...
id3,name3,age3,...
id4,name4,age4,...

b.txt:
id1,address1,...
id2,address2,...
id3,address3,...

c.txt:
id1,name1,age1,address1,...
id2,name2,age2,address2,...
========================================

I know that a database can do this well,
but I want to handle it with Hadoop if possible.
Can Hadoop meet this requirement?

Any suggestions would help. Thank you very much!

Best Regards,

Gump
