[ 
https://issues.apache.org/jira/browse/PIG-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891789#action_12891789
 ] 

Olga Natkovich commented on PIG-1249:
-------------------------------------

Jeff, sorry this patch has not gotten much attention for a while. Can I ask you 
to do the following:

(1) Regenerate the patch for the latest trunk and make sure that the tests are 
passing and we get no additional warnings
(2) Add a doc comment that describes in one place what the exact heuristics 
are, when they are applied, and how they can be influenced. I will ask 
our doc writer to incorporate this information in the Pig 0.8.0 documentation
(3) If it is not already done, can we log the value that will be used so that 
the user knows what is happening

Thanks!

> Safe-guards against misconfigured Pig scripts without PARALLEL keyword
> ----------------------------------------------------------------------
>
>                 Key: PIG-1249
>                 URL: https://issues.apache.org/jira/browse/PIG-1249
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.8.0
>            Reporter: Arun C Murthy
>            Assignee: Jeff Zhang
>            Priority: Critical
>             Fix For: 0.8.0
>
>         Attachments: PIG-1249-4.patch, PIG-1249.patch, PIG_1249_2.patch, 
> PIG_1249_3.patch
>
>
> It would be *very* useful for Pig to have safe-guards against naive scripts 
> which process a *lot* of data without the use of PARALLEL keyword.
> We've seen a fair number of instances where naive users process huge 
> data-sets (>10TB) with badly mis-configured #reduces e.g. 1 reduce. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
