PERFORMANCE: Removing Union from map side of query with COGROUP
---------------------------------------------------------------
Key: PIG-409
URL: https://issues.apache.org/jira/browse/PIG-409
Project: Pig
Issue Type: Improvement
Reporter: Olga Natkovich
Currently, the map side code is not aware which side of the cogroup it is
processing so it assumes that it processes all by putting a union at the end of
the pipeline. This is fairly inefficient.
A better approach would be to figure out which file is processed in confiugre
call. There seems to be away to do this with hadoop but it is not documented so
might not be guaranteed - need to follow up with somebody from hadoop project.
Another approach is to check it the first time map is called and to pick the
execution plan that matches that part.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.