[ 
https://issues.apache.org/jira/browse/PIG-3379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733919#comment-13733919
 ] 

Xuefu Zhang commented on PIG-3379:
----------------------------------

[~daijy] Thanks for your suggestion. While your patch does make "describe A" 
work, it generates the wrong result with the new test case in my patch. 
Further, the logical plan for "EventsPerMinute" (shown below) contains only 
one "DistinctDevices" operator, which is incorrect: both SIZE calls project 
the same operator. My original patch was meant to fix this by making sure that 
each projected expression points to the right operator. Please let me know 
your further thoughts.

    |---EventsPerMinute: (Name: LOForEach Schema: timeStamp#141:long,nbDevices#142:long,nbDevicesWatching#143:long)
        |   |
        |   (Name: LOGenerate[false,false,false] Schema: timeStamp#141:long,nbDevices#142:long,nbDevicesWatching#143:long)ColumnPrune:InputUids=[135, 134]ColumnPrune:OutputUids=[141, 143, 142]
        |   |   |
        |   |   (Name: Multiply Type: long Uid: 141)
        |   |   |
        |   |   |---group:(Name: Project Type: long Uid: 134 Input: 0 Column: (*))
        |   |   |
        |   |   |---(Name: Cast Type: long Uid: 139)
        |   |       |
        |   |       |---(Name: Constant Type: int Uid: 139)
        |   |   |
        |   |   (Name: UserFunc(org.apache.pig.builtin.BagSize) Type: long Uid: 142)
        |   |   |
        |   |   |---DistinctDevices:(Name: Project Type: bag Uid: 135 Input: 1 Column: (*))
        |   |   |
        |   |   (Name: UserFunc(org.apache.pig.builtin.BagSize) Type: long Uid: 143)
        |   |   |
        |   |   |---DistinctDevices:(Name: Project Type: bag Uid: 135 Input: 1 Column: (*))
        |   |
        |   |---(Name: LOInnerLoad[0] Schema: group#134:long)
        |   |
        |   |---DistinctDevices: (Name: LOFilter Schema: eventTime#106:long,deviceId#107:chararray,eventName#108:chararray)
        |       |   |
        |       |   (Name: Equal Type: boolean Uid: 138)
        |       |   |
        |       |   |---eventName:(Name: Project Type: chararray Uid: 108 Input: 0 Column: 2)
        |       |   |
        |       |   |---(Name: Constant Type: chararray Uid: 137)
        |       |
        |       |---Events: (Name: LOInnerLoad[1] Schema: eventTime#106:long,deviceId#107:chararray,eventName#108:chararray)

                
> Alias reuse in nested foreach causes PIG script to fail
> -------------------------------------------------------
>
>                 Key: PIG-3379
>                 URL: https://issues.apache.org/jira/browse/PIG-3379
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.11.1
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: PIG-3379-draft.patch, PIG-3379.patch
>
>
> The following script fails:
> {code:title=temp.pig}
> Events = LOAD 'x' AS (eventTime:long, deviceId:chararray, eventName:chararray);
> Events = FOREACH Events GENERATE eventTime, deviceId, eventName;
> EventsPerMinute = GROUP Events BY (eventTime / 60000);
> EventsPerMinute = FOREACH EventsPerMinute {
>   DistinctDevices = DISTINCT Events.deviceId;
>   nbDevices = SIZE(DistinctDevices);
>   DistinctDevices = FILTER Events BY eventName == 'xuaHeartBeat';
>   nbDevicesWatching = SIZE(DistinctDevices);
>   GENERATE $0*60000 as timeStamp, nbDevices as nbDevices, nbDevicesWatching as nbDevicesWatching;
> }
> EventsPerMinute = FILTER EventsPerMinute BY timeStamp >= 0 AND timeStamp < 100000;
> A = FOREACH EventsPerMinute GENERATE timeStamp;
> describe A;
> {code}
> With the error:
> {code}
> 2013-07-16 11:31:20,450 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1025: 
> <file /home/xzhang/Documents/temp.pig, line 14, column 37> Invalid field projection. Projected field [timeStamp] does not exist in schema: deviceId:chararray.
> {code}
> Using a distinct alias name for the second "DistinctDevices" fixes the 
> problem. Removing the last FILTER statement also fixes it.
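
The workaround mentioned in the description can be sketched as follows. This 
is the reported script with only the second nested alias renamed, so that no 
alias is reused inside the nested FOREACH. The name "WatchingEvents" is 
hypothetical (it does not appear in the report), and the script has not been 
verified beyond what the description states.

```pig
Events = LOAD 'x' AS (eventTime:long, deviceId:chararray, eventName:chararray);
Events = FOREACH Events GENERATE eventTime, deviceId, eventName;
EventsPerMinute = GROUP Events BY (eventTime / 60000);
EventsPerMinute = FOREACH EventsPerMinute {
  DistinctDevices = DISTINCT Events.deviceId;
  nbDevices = SIZE(DistinctDevices);
  -- Hypothetical alias name: was a second "DistinctDevices" in the report.
  WatchingEvents = FILTER Events BY eventName == 'xuaHeartBeat';
  nbDevicesWatching = SIZE(WatchingEvents);
  GENERATE $0*60000 AS timeStamp, nbDevices AS nbDevices, nbDevicesWatching AS nbDevicesWatching;
}
EventsPerMinute = FILTER EventsPerMinute BY timeStamp >= 0 AND timeStamp < 100000;
A = FOREACH EventsPerMinute GENERATE timeStamp;
DESCRIBE A;
```

With distinct aliases, each SIZE call should refer to its own nested operator, 
so the plan would be expected to contain both a DISTINCT and a FILTER branch 
rather than the single "DistinctDevices" filter shown in the plan above.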

