[ https://issues.apache.org/jira/browse/PIG-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-1068:
------------------------------

    Status: Patch Available  (was: Open)

> COGROUP fails with 'Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple'
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1068
>                 URL: https://issues.apache.org/jira/browse/PIG-1068
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Vikram Oberoi
>            Assignee: Richard Ding
>             Fix For: 0.6.0
>
>         Attachments: cogroup-bug.pig, log, PIG-1068.patch
>
>
> The COGROUP in the following script fails in its map:
> {code}
> logs = LOAD '$LOGS' USING PigStorage() AS (ts:int, id:chararray, command:chararray, comments:chararray);
>
> SPLIT logs INTO logins IF command == 'login', all_quits IF command == 'quit';
>
> -- Project login clients and count them by ID.
> login_info = FOREACH logins {
>     GENERATE id as id,
>     comments AS client;
> };
>
> logins_grouped = GROUP login_info BY (id, client);
>
> count_logins_by_client = FOREACH logins_grouped {
>     generate group.id AS id, group.client AS client, COUNT($1) AS count;
> }
>
> -- Get the first quit.
> all_quits_grouped = GROUP all_quits BY id;
>
> quits = FOREACH all_quits_grouped {
>     ordered = ORDER all_quits BY ts ASC;
>     last_quit = LIMIT ordered 1;
>     GENERATE FLATTEN(last_quit);
> }
>
> -- Now, group all the info together.
> joined_session_info = COGROUP quits BY id, count_logins_by_client BY id;
>
> DUMP joined_session_info;
> {code}
> Here's the stack trace:
> {code}
> java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableTuple
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:415)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:229)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:157)
> {code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.