PERFORMANCE: Removing Union from map side of query with COGROUP
---------------------------------------------------------------

                 Key: PIG-409
                 URL: https://issues.apache.org/jira/browse/PIG-409
             Project: Pig
          Issue Type: Improvement
            Reporter: Olga Natkovich


Currently, the map-side code does not know which input of the cogroup it is
processing, so it assumes that it may be processing any of them and puts a
union at the end of the pipeline. This is fairly inefficient.

A better approach would be to figure out which file is being processed in the
configure call. There seems to be a way to do this with Hadoop, but it is not
documented, so it might not be guaranteed. Need to follow up with somebody from
the Hadoop project.
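
As a rough illustration of the configure-time idea, the sketch below maps the
file of the current split to a cogroup input. It is not Pig code: the class
CogroupBranchResolver, its method names, and the assumption that the split's
file path is available as a plain string are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: maps the current split's file to a cogroup input index. */
final class CogroupBranchResolver {
    // One load path per cogroup input, in cogroup order (assumed to be known up front).
    private final List<String> branchInputPaths;

    CogroupBranchResolver(List<String> branchInputPaths) {
        this.branchInputPaths = new ArrayList<>(branchInputPaths);
    }

    /**
     * Returns the index of the first input whose path equals the split file or is a
     * parent directory of it, or -1 if no input matches.
     */
    int resolve(String splitFile) {
        for (int i = 0; i < branchInputPaths.size(); i++) {
            String root = branchInputPaths.get(i);
            String dirPrefix = root.endsWith("/") ? root : root + "/";
            if (splitFile.equals(root) || splitFile.startsWith(dirPrefix)) {
                return i;
            }
        }
        return -1;
    }
}
```

A result of -1 could be treated as a signal to fall back to the existing
union-based plan, for example when the input cannot be identified.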

Another approach is to check which input is being processed the first time map
is called and to pick the execution plan that matches that input.
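
A minimal sketch of this lazy selection follows. It is only an illustration
under stated assumptions: LazyBranchMapper, the use of UnaryOperator<String> as
a stand-in for a per-input execution plan, and the IntSupplier that reports the
current input index are all hypothetical and do not correspond to Pig classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;
import java.util.function.UnaryOperator;

/** Hypothetical mapper wrapper that picks one per-input plan on the first map() call. */
final class LazyBranchMapper {
    private final List<UnaryOperator<String>> plans; // one plan per cogroup input
    private final IntSupplier branchLookup;          // reports the current input index (assumed)
    private UnaryOperator<String> chosen;            // cached after the first call

    LazyBranchMapper(List<UnaryOperator<String>> plans, IntSupplier branchLookup) {
        this.plans = new ArrayList<>(plans);
        this.branchLookup = branchLookup;
    }

    /** Runs the record through the plan for this task's input, resolving it once. */
    String map(String record) {
        if (chosen == null) {
            // The lookup runs only on the first record; later calls reuse the plan.
            chosen = plans.get(branchLookup.getAsInt());
        }
        return chosen.apply(record);
    }
}
```

The lookup cost is paid once per map task, and every later record goes
straight to the selected plan with no union operator in the pipeline.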
