Use distributed cache to store samples
--------------------------------------

                 Key: PIG-1218
                 URL: https://issues.apache.org/jira/browse/PIG-1218
             Project: Pig
          Issue Type: Improvement
            Reporter: Olga Natkovich
            Assignee: Richard Ding
             Fix For: 0.7.0


Currently, for skew join and order by, we use a sample file that is written
only to the DFS (not to the distributed cache). As a result, it gets opened and
copied around more than necessary. This hurts query performance and also
places unnecessary load on the name node.
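
To make the cost difference concrete, here is a rough toy model in Java. It is
not Pig or Hadoop code and calls no Hadoop APIs; the class and method names are
hypothetical. It assumes that without the distributed cache every task opens
the sample file on the DFS itself, while with the distributed cache the file is
fetched at most once per node that runs a task.

```java
// Toy model only: counts how often the sample file is opened on the DFS.
// All names are hypothetical; nothing here is taken from Pig or Hadoop.
final class SampleFetchModel {

    // Assumption: without the distributed cache, each task opens the
    // sample file on the DFS on its own, so opens grow with the task count.
    static int opensWithoutCache(int tasks) {
        return tasks;
    }

    // Assumption: with the distributed cache, the file is localized once
    // per node, so opens are bounded by the number of nodes in use.
    static int opensWithCache(int tasks, int nodes) {
        return Math.min(tasks, nodes);
    }

    public static void main(String[] args) {
        int tasks = 200; // illustrative values, not measurements
        int nodes = 20;
        System.out.println("opens without cache: " + opensWithoutCache(tasks));
        System.out.println("opens with cache:    " + opensWithCache(tasks, nodes));
    }
}
```

Under these assumptions, the number of name node lookups for the sample file
scales with the number of nodes rather than the number of tasks.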

