[ 
https://issues.apache.org/jira/browse/PIG-920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772066#action_12772066
 ] 

Richard Ding commented on PIG-920:
----------------------------------

Add additional comments.

> optimizing diamond queries
> --------------------------
>
>                 Key: PIG-920
>                 URL: https://issues.apache.org/jira/browse/PIG-920
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Olga Natkovich
>            Assignee: Richard Ding
>         Attachments: PIG-920.patch, PIG-920.patch
>
>
> The following query
> A = load 'foo';
> B = filter A by $0 > 1;
> C = filter A by $1 == 'foo';
> D = COGROUP C by $0, B by $0;
> ......
> is not executed efficiently. Currently, it runs a map-only job that
> basically reads and writes the same data before doing the query processing.
> A query in which the data is loaded twice actually executes more efficiently.
> This is not an uncommon query, and we should fix this issue.

