[jira] Created: (PIG-920) optimizing diamond queries

Olga Natkovich (JIRA) Thu, 13 Aug 2009 10:13:38 -0700

optimizing diamond queries
--------------------------

                 Key: PIG-920
                 URL: https://issues.apache.org/jira/browse/PIG-920
             Project: Pig
          Issue Type: Improvement
            Reporter: Olga Natkovich



The following query

A = load 'foo';
B = filter A by $0 > 1;
C = filter A by $1 == 'foo';
D = COGROUP C by $0, B by $0;
......

is not executed efficiently. Currently, it runs a map-only job that 
basically reads and writes the same data before doing the query processing.

A query in which the data is loaded twice actually executes more efficiently.
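
For comparison, the double-load form of the query might look like the
sketch below. The aliases A1 and A2 are hypothetical names introduced here
for illustration; everything else mirrors the query above.

-- Sketch only: load the same input twice so each filter has its own source.
A1 = load 'foo';
A2 = load 'foo';
B = filter A1 by $0 > 1;
C = filter A2 by $1 == 'foo';
D = COGROUP C by $0, B by $0;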

This is not an uncommon query and we should fix this issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
