Shravan Matthur Narayanamurthy commented on PIG-615:

Should I submit the changes I suggested in our discussion as a patch?

A summary of the discussion follows:
Under the current logic, whether the 4th MR job is generated for a limit 
depends on whether the *"parallel"* keyword is used.
Though the logic is not directly dependent on cluster configuration, some 
cluster configs require this 4th MR job and some don't.
For example, if the cluster is configured to use one reducer when 
parallelism is -1 or unspecified, our current logic works because the 4th MR 
job is redundant.
However, if the cluster is configured to use some other number of reducers 
when parallelism is unspecified, such as 0.9 times the number of reduce slots, 
then the 4th MR job is necessary.
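
To make the two cluster setups concrete, here is a minimal Python sketch. It is 
not Pig or Hadoop code: the function names, the use of -1 as the "unspecified" 
marker, and the figure of 100 reduce slots are all assumptions made for 
illustration. It also takes as given the premise of this discussion, namely 
that a limit run with more than one reducer needs a final single-reducer job.

```python
UNSPECIFIED = -1  # assumption: stands for "no parallel clause given"


def reducers_default_one(requested, reduce_slots):
    """Hypothetical cluster A: unspecified parallelism falls back to 1 reducer."""
    return 1 if requested == UNSPECIFIED else requested


def reducers_default_scaled(requested, reduce_slots):
    """Hypothetical cluster B: unspecified parallelism falls back to
    0.9 times the number of reduce slots (integer arithmetic, at least 1)."""
    if requested == UNSPECIFIED:
        return max(1, (9 * reduce_slots) // 10)
    return requested


def limit_needs_extra_job(effective_reducers):
    """Premise from the discussion: with more than one reducer, a final
    single-reducer job is required for the limit to be correct."""
    return effective_reducers > 1


slots = 100  # assumed cluster size, for illustration only
on_a = reducers_default_one(UNSPECIFIED, slots)
on_b = reducers_default_scaled(UNSPECIFIED, slots)
print("cluster A:", on_a, "reducers, extra job needed:", limit_needs_extra_job(on_a))
print("cluster B:", on_b, "reducers, extra job needed:", limit_needs_extra_job(on_b))
```

The same script without a parallel clause ends up with one reducer on the 
first cluster and many on the second, so only the second needs the 4th job.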

That is, the code implicitly assumes that if parallel is not explicitly 
specified, the number of reducers is 1. The logic therefore needs to be 
changed to include the 4th MR job whenever the parallelism is not explicitly 
set to 1, so that we produce correct results, even though in some cases the 
extra job will be redundant.
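
The proposed change can be sketched as below. This is illustrative Python, not 
the actual Pig compiler code: the function names and the -1 marker for 
"parallel not given" are hypothetical, and the "current" rule is modelled from 
the summary above rather than copied from the source.

```python
UNSPECIFIED = -1  # assumption: marks a missing "parallel" clause


def current_rule(requested_parallelism):
    """Current behaviour as summarised (modelled, not taken from Pig):
    the extra job is added only when parallel is given explicitly and is not 1."""
    return requested_parallelism != UNSPECIFIED and requested_parallelism != 1


def proposed_rule(requested_parallelism):
    """Proposed behaviour: add the extra job unless parallel is explicitly 1."""
    return requested_parallelism != 1


# Compare both rules for: no parallel clause, parallel 1, parallel 10.
for p in (UNSPECIFIED, 1, 10):
    print(p, current_rule(p), proposed_rule(p))
```

The two rules differ only when parallel is unspecified: the current rule skips 
the 4th job there, while the proposed rule always adds it, at the cost of one 
redundant job on clusters that default to a single reducer.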

> Wrong number of jobs with limit
> -------------------------------
>                 Key: PIG-615
>                 URL: https://issues.apache.org/jira/browse/PIG-615
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Olga Natkovich
>            Assignee: Shravan Matthur Narayanamurthy

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
