Pradeep Kamath updated PIG-563:

    Attachment: PIG-563-v3.patch

COUNT.Initial was implemented that way so that it would produce the right 
result if it were called in the reduce in the non-combiner case. However, 
since we currently plan to call COUNT.Initial only when the combine plan is 
also present, it is guaranteed to be called only in the map. I have therefore 
changed it to emit 1, as suggested in the review comment, in the new version 
of the patch (PIG-563-v3.patch) in the attachment.
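
To illustrate why emitting 1 from the initial stage is safe when the combiner 
may run 0, 1 or more times, here is a minimal sketch in Python. It is only a 
model of the idea, not Pig's Java code: the function names (count_initial, 
count_intermediate, count_final, run_count) and the chunk-of-two combining 
are assumptions made for this example.

```python
def count_initial(record):
    # Map side: called once per input tuple, so each call contributes 1.
    return 1


def count_intermediate(partials):
    # Combiner: may run zero or more times; it only sums partial counts,
    # so its output has the same meaning as its input.
    return sum(partials)


def count_final(partials):
    # Reduce side: sums whatever partial counts arrive.
    return sum(partials)


def run_count(records, combiner_passes):
    # Hypothetical driver: applies the combiner `combiner_passes` times,
    # each time merging neighbouring partial counts in chunks of two.
    partials = [count_initial(r) for r in records]
    for _ in range(combiner_passes):
        partials = [
            count_intermediate(partials[i:i + 2])
            for i in range(0, len(partials), 2)
        ]
    return count_final(partials)
```

Because the intermediate and final stages both just add partial counts, 
run_count returns the number of records whether combiner_passes is 0, 1 or 
larger.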

> PERFORMANCE: enable combiner to be called 0 or more times whenever the 
> combiner is used for a pig query
> ------------------------------------------------------------------------------------------------------
>                 Key: PIG-563
>                 URL: https://issues.apache.org/jira/browse/PIG-563
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: types_branch
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>             Fix For: types_branch
>         Attachments: PIG-563-v2.patch, PIG-563-v3.patch, PIG-563.patch
> Currently Pig's use of the combiner assumes the combiner is called exactly 
> once in Hadoop. With Hadoop 18, the combiner could be called 0, 1 or more 
> times. This issue is to track changes needed in the CombinerOptimizer visitor 
> and the built-in Algebraic UDFs (SUM, COUNT, MIN, MAX, AVG) to be able to work 
> in this new model.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.