[jira] Commented: (PIG-1068) COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
[ https://issues.apache.org/jira/browse/PIG-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785029#action_12785029 ]

Hadoop QA commented on PIG-1068:

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12426691/PIG-1068.patch
  against trunk revision 886015.

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified tests.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs. The patch does not introduce any new Findbugs warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed core unit tests.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/76/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/76/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/76/console

This message is automatically generated.
> COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
> ---
>
>                 Key: PIG-1068
>                 URL: https://issues.apache.org/jira/browse/PIG-1068
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Vikram Oberoi
>            Assignee: Richard Ding
>             Fix For: 0.6.0
>
>         Attachments: cogroup-bug.pig, log, PIG-1068.patch
>
>
> The COGROUP in the following script fails in its map:
> {code}
> logs = LOAD '$LOGS' USING PigStorage() AS (ts:int, id:chararray, command:chararray, comments:chararray);
>
> SPLIT logs INTO logins IF command == 'login', all_quits IF command == 'quit';
>
> -- Project login clients and count them by ID.
> login_info = FOREACH logins {
>     GENERATE id as id,
>     comments AS client;
> };
>
> logins_grouped = GROUP login_info BY (id, client);
>
> count_logins_by_client = FOREACH logins_grouped {
>     generate group.id AS id, group.
[jira] Commented: (PIG-1068) COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
[ https://issues.apache.org/jira/browse/PIG-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783978#action_12783978 ]

Richard Ding commented on PIG-1068:
---

The cause of this bug is as follows. On the one hand, the value (as in key/value pairs) received by a reducer may not be the complete "value"; portions of it may be carried in the key. In this case, the real "value" is stitched together by the packager. On the other hand, the MultiQuery optimizer merges jobs with different map key types by wrapping the keys in tuples (so that the resulting job has tuple as the common map key type). Unfortunately, the unwrapping of the key happens in the demuxer (after the packager), so the "stitched up" value isn't the expected value.

The solution will be to move the MultiQuery unwrapping logic from the demuxer to the packager.

> COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
> ---
>
>                 Key: PIG-1068
>                 URL: https://issues.apache.org/jira/browse/PIG-1068
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Vikram Oberoi
>            Assignee: Richard Ding
>             Fix For: 0.6.0
>
>         Attachments: cogroup-bug.pig, log
>
>
> The COGROUP in the following script fails in its map:
> {code}
> logs = LOAD '$LOGS' USING PigStorage() AS (ts:int, id:chararray, command:chararray, comments:chararray);
>
> SPLIT logs INTO logins IF command == 'login', all_quits IF command == 'quit';
>
> -- Project login clients and count them by ID.
> login_info = FOREACH logins {
>     GENERATE id as id,
>     comments AS client;
> };
>
> logins_grouped = GROUP login_info BY (id, client);
>
> count_logins_by_client = FOREACH logins_grouped {
>     generate group.id AS id, group.client AS client, COUNT($1) AS count;
> }
>
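
The ordering problem described in Richard Ding's comment above can be illustrated with a small toy model. This is only a sketch in Python, not Pig's Java implementation: the function names (`wrap_key`, `stitch`, `package_then_unwrap`, `unwrap_then_package`), the single-field key, and the sample record are all hypothetical and chosen for illustration. It assumes the key field was stripped from the value on the map side and that the merged job wrapped the key in a one-element tuple.

```python
# Toy model of the packager/demuxer ordering issue. NOT Pig's actual code;
# every name below is a hypothetical stand-in for illustration.

def wrap_key(key):
    """Model the MultiQuery merge: wrap a scalar key in a one-element tuple
    so that all merged jobs share the tuple key type."""
    return (key,)

def stitch(key, partial_value, key_pos):
    """Model the packager: rebuild the full record by putting the key back
    at the position it was stripped from on the map side."""
    full = list(partial_value)
    full.insert(key_pos, key)
    return tuple(full)

def package_then_unwrap(wrapped_key, partial_value, key_pos):
    """Ordering described as the cause of the bug: stitch first,
    then unwrap the key afterwards (in the 'demuxer')."""
    value = stitch(wrapped_key, partial_value, key_pos)  # wrapped key leaks into the value
    key = wrapped_key[0]
    return key, value

def unwrap_then_package(wrapped_key, partial_value, key_pos):
    """Ordering described as the solution: unwrap inside the 'packager',
    before stitching."""
    key = wrapped_key[0]
    value = stitch(key, partial_value, key_pos)
    return key, value

# Hypothetical record (ts, id, command) keyed by id at position 1.
wrapped = wrap_key("u1")
partial = (1, "quit")  # the value with the key field removed

buggy = package_then_unwrap(wrapped, partial, key_pos=1)
fixed = unwrap_then_package(wrapped, partial, key_pos=1)
print(buggy)
print(fixed)
```

In this model both orderings recover the same key, but only the second rebuilds the original record; the first leaves a tuple where a scalar field is expected, which is the kind of type mismatch the comment attributes to unwrapping after the packager.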