[ https://issues.apache.org/jira/browse/PIG-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-558: ------------------------------- Resolution: Fixed Status: Resolved (was: Patch Available) patch committed; thanks, pradeep > Distinct followed by a Join results in Invalid size 0 for a tuple error > ----------------------------------------------------------------------- > > Key: PIG-558 > URL: https://issues.apache.org/jira/browse/PIG-558 > Project: Pig > Issue Type: Bug > Components: impl > Affects Versions: types_branch > Reporter: Viraj Bhat > Assignee: Pradeep Kamath > Fix For: types_branch > > Attachments: PIG-558-additional.patch, PIG-558.patch, table1, table2 > > > The following Pig script does a right outer join after the DISTINCT. > {code} > nonuniqtable1 = LOAD 'table1' AS (f1:chararray); > table1 = DISTINCT nonuniqtable1; > table2 = LOAD 'table2' AS (f1:chararray, f2:int); > temp = COGROUP table1 BY f1 INNER, table2 BY f1; > DESCRIBE temp; > explain temp; > dump temp; > {code} > ======================================================================================================== > It results in the following error. This is true for other join types as well. > ======================================================================================================== > java.io.IOException: Invalid size 0 for a tuple > at > org.apache.pig.data.DataReaderWriter.readDatum(DataReaderWriter.java:57) > at > org.apache.pig.data.DataReaderWriter.readDatum(DataReaderWriter.java:62) > at org.apache.pig.builtin.BinStorage.getNext(BinStorage.java:90) > at > org.apache.pig.backend.executionengine.PigSlice.next(PigSlice.java:103) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:157) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:133) > at > org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:165) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:45) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) > at > org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209) > ======================================================================================================== -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.