[ 
https://issues.apache.org/jira/browse/PIG-466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865631#action_12865631
 ] 

Scott Carey commented on PIG-466:
---------------------------------

This is both a performance and usability issue.

If the optimizer could automatically push projections up to the earliest 
possible point in the plan, it would also unclutter large scripts that 
manually project 'early and often' for performance reasons.
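
To make the 'early and often' pattern concrete, here is a toy Python model 
(not Pig itself, and not how the optimizer is implemented). The column names 
and the helpers project, total_by_user and cells are made up for 
illustration. It only shows that projecting before an aggregation leaves the 
result unchanged while fewer values are carried between operators.

```python
from collections import defaultdict

def project(rows, keep):
    """Keep only the named columns of each row (a manual early projection)."""
    return [{name: row[name] for name in keep} for row in rows]

def total_by_user(rows):
    """Stand-in for a group-and-sum step; it reads only 'user' and 'bytes'."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["user"]] += row["bytes"]
    return dict(totals)

def cells(rows):
    """Number of column values carried into the next operator."""
    return sum(len(row) for row in rows)

# Hypothetical input: five columns, of which the aggregation needs two.
raw = [
    {"user": "a", "bytes": 10, "url": "/x", "agent": "ua1", "referrer": "r1"},
    {"user": "b", "bytes": 5, "url": "/y", "agent": "ua2", "referrer": "r2"},
    {"user": "a", "bytes": 7, "url": "/z", "agent": "ua3", "referrer": "r3"},
]

# Without early projection, every column travels to the aggregation.
late_input = raw
# With early projection, unused columns are dropped first.
early_input = project(raw, ["user", "bytes"])

late_result = total_by_user(late_input)
early_result = total_by_user(early_input)

print(cells(late_input), cells(early_input))
print(late_result == early_result)
```

An optimizer that pruned unused columns on its own would make the explicit 
project call unnecessary in a script like this.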

I have reason to believe that some of these extra projection lines also 
interfere with other performance optimizations. On 0.5, multi-query 
optimization sometimes fails because of extra projections in between, and 
some forms of projection break combiner use as well.


> PERFORMANCE: dropping the columns as soon as possible
> -----------------------------------------------------
>
>                 Key: PIG-466
>                 URL: https://issues.apache.org/jira/browse/PIG-466
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.2.0
>            Reporter: Olga Natkovich
>            Assignee: Daniel Dai
>             Fix For: 0.8.0
>
>
> Currently, each operator carries all the data until foreach is encountered. 
> This can cause significant performance degradation.

