Distinct followed by a Join results in Invalid size 0 for a tuple error
-----------------------------------------------------------------------

                 Key: PIG-558
                 URL: https://issues.apache.org/jira/browse/PIG-558
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: types_branch
            Reporter: Viraj Bhat
             Fix For: types_branch


The following Pig script does a right outer join after the DISTINCT.
{code}
nonuniqtable1 = LOAD 'table1' AS (f1:chararray);
table1 = DISTINCT nonuniqtable1;
table2 = LOAD 'table2' AS (f1:chararray, f2:int);
temp = COGROUP table1 BY f1 INNER, table2 BY f1;
DESCRIBE temp;
explain temp;
dump temp;
{code}
========================================================================================================
It results in the following error. This is true for other join types as well.
========================================================================================================
java.io.IOException: Invalid size 0 for a tuple
        at 
org.apache.pig.data.DataReaderWriter.readDatum(DataReaderWriter.java:57)
        at 
org.apache.pig.data.DataReaderWriter.readDatum(DataReaderWriter.java:62)
        at org.apache.pig.builtin.BinStorage.getNext(BinStorage.java:90)
        at 
org.apache.pig.backend.executionengine.PigSlice.next(PigSlice.java:103)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:157)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:133)
        at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:165)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:45)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
        at 
org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
========================================================================================================

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to