[ https://issues.apache.org/jira/browse/PIG-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-1218:
--------------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)
Committed patch PIG-1218_2.patch, since the merge join changes need to be
reworked and will be handled in a different patch.
Thanks Richard!
> Use distributed cache to store samples
> --------------------------------------
>
> Key: PIG-1218
> URL: https://issues.apache.org/jira/browse/PIG-1218
> Project: Pig
> Issue Type: Improvement
> Reporter: Olga Natkovich
> Assignee: Richard Ding
> Fix For: 0.7.0
>
> Attachments: PIG-1218.patch, PIG-1218_2.patch, PIG-1218_3.patch
>
>
> Currently, for skew join and order by, we use a sample that is just
> written to the DFS (not the distributed cache) and, as a result, gets opened and
> copied around more than necessary. This hurts query performance and also
> places unnecessary load on the name node.