[ 
https://issues.apache.org/jira/browse/PIG-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498165#comment-13498165
 ] 

Koji Noguchi commented on PIG-3051:
-----------------------------------

Tracking the LogicalPlan change.

Original:
{noformat}
U1: (Name: LOStore Schema: sortCol#1871:int,label#1872:chararray,cnt#1870:long)
|
|---U1: (Name: LOForEach Schema: sortCol#1871:int,label#1872:chararray,cnt#1870:long)
    |   |
    |   (Name: LOGenerate[false,false,false] Schema: sortCol#1871:int,label#1872:chararray,cnt#1870:long)
    |   |   |
    |   |   (Name: Constant Type: int Uid: 1871)
    |   |   |
    |   |   (Name: Constant Type: chararray Uid: 1872)
    |   |   |
    |   |   cnt:(Name: Project Type: long Uid: 1870 Input: 0 Column: (*))
    |   |
    |   |---(Name: LOInnerLoad[2] Schema: cnt#1870:long)
    |   *****HERE*****
    |---ONEROW: (Name: LOLimit Schema: sortCol#1868:int,label#1869:chararray,cnt#1870:long)
        |
        |---G4: (Name: LOSplitOutput Schema: sortCol#1868:int,label#1869:chararray,cnt#1870:long)
            |   |
            |   (Name: Constant Type: boolean Uid: 1867)
            |
            |---G4: (Name: LOSplit Schema: sortCol#1864:int,label#1857:chararray,cnt#1865:long)
                |
                |---G4: (Name: LOSort Schema: sortCol#1864:int,label#1857:chararray,cnt#1865:long)
                    |   |
                    |   cnt:(Name: Project Type: long Uid: 1865 Input: 0 Column: 2)
                    |
                    |---G3: (Name: LOForEach Schema: sortCol#1864:int,label#1857:chararray,cnt#1865:long)
                        |   |
{noformat}

After org.apache.pig.newplan.logical.rules.LimitOptimizer:

{noformat}
U1: (Name: LOStore Schema: sortCol#1871:int,label#1872:chararray,cnt#1870:long)
|
|---U1: (Name: LOForEach Schema: sortCol#1871:int,label#1872:chararray,cnt#1870:long)
    |   |
    |   (Name: LOGenerate[false,false,false] Schema: sortCol#1871:int,label#1872:chararray,cnt#1870:long)
    |   |   |
    |   |   (Name: Constant Type: int Uid: 1871)
    |   |   |
    |   |   (Name: Constant Type: chararray Uid: 1872)
    |   |   |
    |   |   cnt:(Name: Project Type: long Uid: 1870 Input: 0 Column: (*))
    |   |
    |   |---(Name: LOInnerLoad[2] Schema: cnt#1870:long)
    |
    |---(Name: LOSort Schema: sortCol#1868:int,label#1869:chararray,cnt#1870:long)
        |   | *****HERE*****
        |   cnt:(Name: Project Type: long Uid: 1865 Input: 0 Column: 2)
        |
        |---G4: (Name: LOSplitOutput Schema: sortCol#1868:int,label#1869:chararray,cnt#1870:long)
            |   |
            |   (Name: Constant Type: boolean Uid: 1867)
            |
            |---G4: (Name: LOSplit Schema: sortCol#1864:int,label#1857:chararray,cnt#1865:long)
                |
                |---G4: (Name: LOSort Schema: sortCol#1864:int,label#1857:chararray,cnt#1865:long)
                    |   |
                    |   cnt:(Name: Project Type: long Uid: 1865 Input: 0 Column: 2)
                    |
                    |---G3: (Name: LOForEach Schema: sortCol#1864:int,label#1857:chararray,cnt#1865:long)
{noformat}

After org.apache.pig.newplan.logical.rules.ColumnMapKeyPrune:
{noformat}
U1: (Name: LOStore Schema: sortCol#1871:int,label#1872:chararray,cnt#1870:long)
|
|---U1: (Name: LOForEach Schema: sortCol#1871:int,label#1872:chararray,cnt#1870:long)
    |   |
    |   (Name: LOGenerate[false,false,false] Schema: sortCol#1871:int,label#1872:chararray,cnt#1870:long)
    |   |   |
    |   |   (Name: Constant Type: int Uid: 1871)
    |   |   |
    |   |   (Name: Constant Type: chararray Uid: 1872)
    |   |   |
    |   |   cnt:(Name: Project Type: long Uid: 1870 Input: 0 Column: (*))
    |   |
    |   |---(Name: LOInnerLoad[2] Schema: cnt#1870:long)
    |
    |---(Name: LOSort Schema: sortCol#1868:int,label#1869:chararray,cnt#1870:long)
        |   |
        |   cnt:(Name: Project Type: long Uid: 1865 Input: 0 Column: 2)
        |
        |---G4: (Name: LOSplitOutput Schema: sortCol#1868:int,label#1869:chararray,cnt#1870:long)
            |   |
            |   (Name: Constant Type: boolean Uid: 1867)
            |
            |---G4: (Name: LOSplit Schema: sortCol#1864:int,label#1857:chararray,cnt#1865:long)
                |
                |---G4: (Name: LOSort Schema: sortCol#1864:int,label#1857:chararray,cnt#1865:long)
                    |   |
                    |   cnt:(Name: Project Type: long Uid: 1865 Input: 0 Column: 2)
                    |
                    |---G3: (Name: LOForEach Schema: sortCol#1864:int,label#1857:chararray,cnt#1865:long)
{noformat}
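
One thing that stands out in the dumps above: the LOSort that replaces LOLimit after LimitOptimizer has the schema uids 1868, 1869 and 1870, but its sort-key Project still carries Uid 1865, which only appears in the schema of the LOSort below the split. Whether this mismatch is what leads to the IndexOutOfBoundsException is not established here. The snippet below is only a rough sketch of that comparison in Python; dangling_uids is a made-up helper for illustration and not part of Pig, and the uid lists are copied by hand from the dumps.

```python
# Hypothetical helper (not part of Pig): report expression uids that are
# missing from the schema of the operator the expressions are attached to.
def dangling_uids(schema_uids, expression_uids):
    """Return the expression uids that do not appear in schema_uids."""
    known = set(schema_uids)
    return [uid for uid in expression_uids if uid not in known]


# Uids copied by hand from the plan dumps in this comment.
# LOSort below the split (G4): schema 1864, 1857, 1865; sort key uid 1865.
inner_sort = {"schema": [1864, 1857, 1865], "sort_keys": [1865]}
# LOSort that replaced LOLimit: schema 1868, 1869, 1870; sort key uid 1865.
merged_sort = {"schema": [1868, 1869, 1870], "sort_keys": [1865]}

print(dangling_uids(inner_sort["schema"], inner_sort["sort_keys"]))    # []
print(dangling_uids(merged_sort["schema"], merged_sort["sort_keys"]))  # [1865]
```

The first call comes back empty because the lower LOSort sorts on a uid from its own schema; the second reports 1865 because the merged LOSort's schema no longer contains it.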
                
> java.lang.IndexOutOfBoundsException failure with LimitOptimizer + ColumnPruning
> --------------------------------------------------------------------------------
>
>                 Key: PIG-3051
>                 URL: https://issues.apache.org/jira/browse/PIG-3051
>             Project: Pig
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.10.0, 0.11
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>
> A user hit a
> "Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1" error
> with a script that had multiple stores and a limit.
> I couldn't reproduce this with a short Pig script (ColumnPruning somehow
> doesn't happen when the script is shortened), but here's a snippet.
> {noformat}
> ...
> G3 = FOREACH G2 GENERATE sortCol, FLATTEN(group) as label, (long)COUNT(G1) as cnt;
> G4 = ORDER G3 BY cnt DESC PARALLEL 25;
> ONEROW = LIMIT G4 1;
> U1 = FOREACH ONEROW GENERATE 3 as sortcol, 'somelabel' as label, cnt;
> store U1 into 'u1' using PigStorage();
> store G4 into 'g4' using PigStorage();
> {noformat}
> With '-t ColumnMapKeyPrune', the job didn't hit the error.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
