I cannot think of a way to do this without writing a UDF. You can write two UDFs:
* GetKey: takes a map as input and outputs the key of the map
* GetValues: takes a bag of maps as input and outputs a bag of the map values
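A minimal sketch of the logic these two UDFs would carry, written in plain Java without the Pig classes so it runs on its own. The class and method names are hypothetical. In an actual UDF each method body would sit in the exec method of a class extending org.apache.pig.EvalFunc, reading the map or bag from the input tuple. It assumes each map holds a single entry with a numeric value; a map with several entries, as in some of the sample rows, would likely need to be split into one entry per record first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the two UDFs; not tied to the Pig API.
class MapUdfSketch {

    // Core of GetKey: return the key of a single-entry map, or null if the map is empty.
    static String getKey(Map<String, Double> m) {
        if (m == null || m.isEmpty()) {
            return null;
        }
        return m.keySet().iterator().next();
    }

    // Core of GetValues: collect the values of every map in the bag into one list.
    static List<Double> getValues(List<Map<String, Double>> bag) {
        List<Double> out = new ArrayList<>();
        for (Map<String, Double> m : bag) {
            if (m != null) {
                out.addAll(m.values());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two records that share the key "4" (made-up values for illustration).
        Map<String, Double> first = Collections.singletonMap("4", 0.5);
        Map<String, Double> second = Collections.singletonMap("4", 0.25);

        // Mimics what SUM(GetValues(...)) would compute for one group.
        double sum = 0.0;
        for (double v : getValues(Arrays.asList(first, second))) {
            sum += v;
        }
        System.out.println(getKey(first) + " " + sum);
    }
}
```

Returning a bag of plain values from GetValues is what lets the built-in SUM be applied to it directly.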
The script would look like this:

b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1) as m;
c = foreach b generate GetKey(m) as key, m;
d = group c by key;
e = foreach d generate group, SUM(GetValues(c.m));

Daniel

On 05/23/2011 07:06 AM, Jameson Li wrote:
Hi all,

I have the Pig code below:

register /home/uu/project/lib/pigudfs.jar
ruls = load 'testurl' as (url:chararray);
b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1);

When I dump b, it returns:

([4#0.1677963])
([193#0.16985779,81#0.10994483])
([418#0.14138427,9#0.1107544,282#0.18699136])

I just want to group by the map key and sum the map values, something like:

c = group b by $0#key;
d = foreach c generate group,SUM(b.$0#value);

How could I write the code?

Thanks,
Jameson Li.
