[ 
https://issues.apache.org/jira/browse/PIG-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914663#action_12914663
 ] 

Thejas M Nair commented on PIG-1642:
------------------------------------

Comments on the patch:
- SampleOptimizer.java expects the sampling MR plan to have only one 
integer argument, which carries the number of reducers that will be used in 
the successor of the sampling job (order-by/skewed-join). We might not 
remember this assumption if we change the sampling plan, so it would be 
safer to throw an error if more than one integer constant is seen in the plan.
- In the test case, the expected number of reducers is computed dynamically 
and used for the check in the first scenario; it can be used in the last 
scenario as well.
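
A rough sketch of the check suggested in the first comment, to make the idea 
concrete. The class and method names (ReducerCountFinder, findReducerCount) 
and the flat list of constants are hypothetical stand-ins; this is not the 
actual SampleOptimizer code and does not use Pig's plan classes.

```java
import java.util.List;

// Hypothetical helper, not part of Pig: scans the constants found in a
// sampling plan and returns the single integer constant, which is assumed
// to be the reducer count of the successor job.
final class ReducerCountFinder {

    static int findReducerCount(List<?> planConstants) {
        Integer found = null;
        for (Object constant : planConstants) {
            if (constant instanceof Integer) {
                if (found != null) {
                    // Fail loudly instead of silently picking one of them.
                    throw new IllegalStateException(
                        "Expected exactly one integer constant in the sampling plan, "
                        + "found at least two: " + found + " and " + constant);
                }
                found = (Integer) constant;
            }
        }
        if (found == null) {
            throw new IllegalStateException(
                "No integer constant found in the sampling plan");
        }
        return found;
    }
}
```

With a guard like this, a later change that adds a second integer constant to 
the sampling plan would fail immediately rather than risk picking up the wrong 
value as the reducer count.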


> Order by doesn't use estimation to determine the parallelism
> ------------------------------------------------------------
>
>                 Key: PIG-1642
>                 URL: https://issues.apache.org/jira/browse/PIG-1642
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>             Fix For: 0.8.0
>
>         Attachments: PIG-1642.patch, PIG-1642_1.patch, PIG-1642_1.patch
>
>
> With PIG-1249, a simple heuristic is used to determine the number of reducers 
> if it isn't specified (via PARALLEL or default_parallel). For an order by 
> statement, however, it still defaults to 1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
