[ 
https://issues.apache.org/jira/browse/PIG-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates reassigned PIG-750:
------------------------------

    Assignee: Thejas M Nair

Our performance tests have shown that having combiner and non-combiner 
functions in the same MR job actually severely slows things down.  We suspect 
that this is because the bags for the non-combiner functions have to pass 
through the combiner, and you pay for the multiple (de)serialization passes.

However, the other points noted in this bug, such as the need to use the 
combiner when algebraic UDFs are involved in simple expressions, are valid and 
are along the lines of issues Thejas is working on for the combiner.  So I'm 
assigning the issue to him.
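
To make the suspected cost concrete, here is a rough Python sketch (not Pig 
code).  It assumes an algebraic function can be split into initial, 
intermediate and final stages, and uses AVG and a median as stand-ins for an 
algebraic and a non-algebraic function.  All function names are hypothetical, 
and pickle is only a stand-in for whatever serialization the MR job does.

```python
import pickle

# Hypothetical three-stage split of an algebraic function (AVG).
def avg_initial(value):
    # Map side: turn one value into a (sum, count) partial.
    return (value, 1)

def avg_intermediate(partials):
    # Combiner: merge partials into one small (sum, count) partial.
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return (total, count)

def avg_final(partials):
    # Reduce side: merge the remaining partials and finish the average.
    total, count = avg_intermediate(partials)
    return total / count

def median(bag):
    # Non-algebraic stand-in: it needs the whole bag at once.
    ordered = sorted(bag)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def combine_mixed(values):
    # Combiner for a mixed job: AVG collapses to a partial, but the raw
    # bag still has to be carried through untouched for the median.
    partial = avg_intermediate([avg_initial(v) for v in values])
    return partial, list(values)

# Three map outputs for the same key (made-up data).
map_outputs = [[1, 2, 3], [4, 5], [6]]

algebraic_only = [
    avg_intermediate([avg_initial(v) for v in chunk]) for chunk in map_outputs
]
mixed = [combine_mixed(chunk) for chunk in map_outputs]

# Bytes that would be serialized out of the combiner in each case.
algebraic_bytes = sum(len(pickle.dumps(p)) for p in algebraic_only)
mixed_bytes = sum(len(pickle.dumps(m)) for m in mixed)

avg_result = avg_final(algebraic_only)
median_result = median([v for _, bag in mixed for v in bag])
```

In the mixed case the combiner output still contains every original value, so 
the extra pass shrinks nothing for the non-algebraic function and only adds 
another round of (de)serialization.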

> Use combiner when a mix of algebraic and non-algebraic functions are used
> -------------------------------------------------------------------------
>
>                 Key: PIG-750
>                 URL: https://issues.apache.org/jira/browse/PIG-750
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Amir Youssefi
>            Assignee: Thejas M Nair
>            Priority: Minor
>
> Currently Pig uses the combiner only when all of a, b, c, ... are algebraic 
> (e.g. SUM, AVG) in a foreach:
> foreach X generate a, b, c, ...
> It would be a performance improvement if it also used the combiner when a 
> mix of algebraic and non-algebraic functions is used.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
