Use distributed cache to store samples
--------------------------------------
Key: PIG-1218
URL: https://issues.apache.org/jira/browse/PIG-1218
Project: Pig
Issue Type: Improvement
Reporter: Olga Natkovich
Assignee: Richard Ding
Fix For: 0.7.0
Currently, for skew join and order by, the sample is just written to the DFS
(not to the distributed cache) and, as a result, gets opened and copied around
more than necessary. This hurts query performance and also places unnecessary
load on the name node.
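
To make the cost difference concrete, here is a rough toy model, not Pig code.
It assumes that with a plain DFS file every task opens the sample itself,
while with the distributed cache the file is fetched at most once per node and
tasks on that node read the local copy. The class name, method names, and the
task and node counts are all hypothetical.

```java
// Toy model only: the counts used in main() are made-up inputs,
// not measurements from Pig or Hadoop.
class SampleReadModel {

    // Plain DFS file: assume each task opens the sample file itself,
    // so the number of remote opens equals the number of tasks.
    static long dfsOpens(long tasks) {
        return tasks;
    }

    // Distributed cache: assume the file is localized at most once per
    // node that runs a task; tasks then read the node-local copy.
    static long cacheOpens(long tasks, long nodes) {
        return Math.min(tasks, nodes);
    }

    public static void main(String[] args) {
        long tasks = 2000; // hypothetical number of tasks needing the sample
        long nodes = 100;  // hypothetical number of worker nodes
        System.out.println("opens via DFS file:          " + dfsOpens(tasks));
        System.out.println("opens via distributed cache: " + cacheOpens(tasks, nodes));
    }
}
```

Under these assumptions the saving grows with the number of tasks per node;
when a job has fewer tasks than nodes, the two counts are the same.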