Hi,

I'm trying to update records in Hadoop using Pig (I know this is not
ideal, but I'm trying it out as a POC).
The data looks like this:

*feed1:*
--> this is the history file
--> the trade key is unique for each order/record

trade-key    trade-add-date    trade-price
*k1          05/21/2012        2000*
k2           04/21/2012        3000
k3           03/21/2012        4000
k4           05/21/2012        5000

*feed2:*
--> this is the latest/daily feed

trade-key    trade-add-date    trade-price
k5           06/22/2012        1000
k6           06/22/2012        2000
*k1          06/21/2012        3000*

--> the trade with key "k1" appears again here, which means the order with
trade key "k1" has been updated

Now I'm looking for the output below (merge both files, find the keys common
to both feeds, and keep only the latest record for each such key in the
output file):
*k1          06/21/2012        3000*
k2           04/21/2012        3000
k3           03/21/2012        4000
k4           05/21/2012        5000
*k5          06/22/2012        1000*
*k6          06/22/2012        2000*
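
To make the merge rule concrete, here is a minimal sketch of the logic in plain Python (not Pig). It assumes that a record in feed2 always replaces the feed1 record with the same trade key, and that keys found in only one feed pass through unchanged. The function name merge_feeds and the tuple layout are made up for illustration only.

```python
def merge_feeds(history, daily):
    """Merge two feeds of (trade_key, trade_add_date, trade_price) tuples.

    Assumption: a record in `daily` overrides the record with the same
    trade key in `history`; all other records are kept as they are.
    """
    merged = {}
    for key, add_date, price in history:
        merged[key] = (key, add_date, price)
    for key, add_date, price in daily:
        # Same key seen again in the daily feed: the newer record wins.
        merged[key] = (key, add_date, price)
    # Sort by trade key only to make the output easy to read.
    return [merged[key] for key in sorted(merged)]


feed1 = [
    ("k1", "05/21/2012", 2000),
    ("k2", "04/21/2012", 3000),
    ("k3", "03/21/2012", 4000),
    ("k4", "05/21/2012", 5000),
]
feed2 = [
    ("k5", "06/22/2012", 1000),
    ("k6", "06/22/2012", 2000),
    ("k1", "06/21/2012", 3000),
]

for record in merge_feeds(feed1, feed2):
    print(*record, sep="\t")
```

In Pig terms this is roughly "group both feeds by trade key and pick the feed2 record when one exists"; the sketch only pins down the expected result, not how to express it in Pig.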

Any help is greatly appreciated!

Regards,
Srinivas
