[ https://issues.apache.org/jira/browse/DATAFU-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthew Hayes closed DATAFU-41.
-------------------------------
    Resolution: Won't Fix

Closing this as it is quite old and there have been no updates.

> BagGroup does not name bag field in some cases
> ----------------------------------------------
>
>                 Key: DATAFU-41
>                 URL: https://issues.apache.org/jira/browse/DATAFU-41
>             Project: DataFu
>          Issue Type: Bug
>            Reporter: Matthew Hayes
>            Priority: Major
>
> For this test:
> {code}
>   /**
>   define BagSum datafu.pig.bags.BagSum();
>   define BagGroup datafu.pig.bags.BagGroup();
>
>   data = LOAD 'input' USING PigStorage(',') AS (id:int, key:chararray, val:int);
>   describe data;
>
>   data2 = GROUP data BY id;
>
>   describe data2;
>
>   data3 = FOREACH data2 GENERATE group as id, BagGroup(data,data.key) as grouped;
>
>   describe data3;
>
>   data4 = FOREACH data3 {
>     summed = FOREACH grouped GENERATE group as key, SUM($1.val) as total;
>     ordered = ORDER summed BY key;
>     GENERATE id, ordered;
>   }
>
>   describe data4;
>
>   STORE data4 INTO 'output';
>    */
>   @Multiline
>   private String bagSumTest;
>
>   @Test
>   public void bagSumTest() throws Exception
>   {
>     PigTest test = createPigTestFromString(bagSumTest);
>     writeLinesToFile("input",
>                      "1,A,1","1,B,2","2,A,3","3,A,4","1,C,5","1,C,6",
>                      "3,A,7","2,B,8","1,A,9","2,A,10");
>     test.runScript();
>     assertOutput(test, "data4",
>         "(1,{(A,10),(B,2),(C,11)})",
>         "(2,{(A,13),(B,8)})",
>         "(3,{(A,11)})");
>   }
> {code}
> {{data3}} is described as:
> {code}
> data3: {id: int,grouped: {(group: chararray,data: {(id: int,key: chararray,val: int)})}}
> {code}
> However, if we change {{data}} to {{data.(key,val)}} in the {{BagGroup}} call, then {{data3}} is described as:
> {code}
> data3: {id: int,grouped: {(group: chararray,{(key: chararray,val: int)})}}
> {code}
> Note that the inner bag now has no name, so you have to reference it by position as {{$1}}. There is
> a separate issue, DATAFU-40, where even when the bag has the name {{data}} you can run into problems later.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)