Hi, I'm trying to update records in Hadoop using Pig (I know this is not ideal, but I'm trying out a POC). The data looks like this:
feed1 --> this is the history file; the trade key is unique for each order/record

trade-key  trade-add-date  trade-price
k1         05/21/2012      2000
k2         04/21/2012      3000
k3         03/21/2012      4000
k4         05/21/2012      5000

feed2 --> this is the latest/daily feed

trade-key  trade-add-date  trade-price
k5         06/22/2012      1000
k6         06/22/2012      2000
k1         06/21/2012      3000   ---> the trade with key "k1" appears again, which means the order with trade key "k1" has an update

Now I'm looking for the output below (merge both files, look for keys common to both feeds, and keep the latest record for each such key in the output file):

k1         06/21/2012      3000
k2         04/21/2012      3000
k3         06/21/2012      4000
k4         07/21/2012      5000
k5         06/22/2012      1000
k6         06/22/2012      2000

Any help is greatly appreciated!

Regards,
Srinivas
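
P.S. To pin down the merge rule, here is a minimal sketch of the intended logic in plain Python rather than Pig. It only illustrates the semantics: for each trade key, keep the record with the latest trade-add-date. The function names `parse` and `merge_latest` are hypothetical, the rows are assumed to be whitespace-separated with dates in MM/DD/YYYY form, and it is an assumption that the daily feed should win when two records for the same key carry the same date.

```python
from datetime import datetime


def parse(lines):
    """Parse 'key MM/DD/YYYY price' rows into (key, date, price) tuples."""
    rows = []
    for line in lines:
        key, date_str, price = line.split()
        rows.append((key, datetime.strptime(date_str, "%m/%d/%Y"), int(price)))
    return rows


def merge_latest(history, daily):
    """Keep one record per trade key: the one with the latest trade-add-date.

    The daily rows are visited after the history rows, so on equal dates
    the daily record replaces the history record (an assumption).
    """
    latest = {}
    for key, date, price in history + daily:
        if key not in latest or date >= latest[key][0]:
            latest[key] = (date, price)
    # Return the records ordered by trade key, with dates formatted back.
    return {k: (d.strftime("%m/%d/%Y"), p) for k, (d, p) in sorted(latest.items())}


feed1 = parse([
    "k1 05/21/2012 2000",
    "k2 04/21/2012 3000",
    "k3 03/21/2012 4000",
    "k4 05/21/2012 5000",
])
feed2 = parse([
    "k5 06/22/2012 1000",
    "k6 06/22/2012 2000",
    "k1 06/21/2012 3000",
])

merged = merge_latest(feed1, feed2)
for key, (date, price) in merged.items():
    print(key, date, price)
```

With the sample feeds, the record kept for "k1" is the one from the daily feed (06/21/2012, 3000). What I need is the equivalent of this in Pig.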
