[ https://issues.apache.org/jira/browse/PIG-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-1735:
--------------------------------
Fix Version/s: (was: 0.9.0)
> Use combiner in cogroup
> -----------------------
>
> Key: PIG-1735
> URL: https://issues.apache.org/jira/browse/PIG-1735
> Project: Pig
> Issue Type: Improvement
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
>
> As reported by Scott Carey in PIG-479, the combiner is not used for
> COGROUP, even when the functions applied to the bags are algebraic.
> Quoting from the comment:
> "For example, I'm not quite sure why this one doesn't use a combiner - it
> reads ~350x as much input bytes from HDFS as its reduce output, a combiner
> would be very effective:
> J = COGROUP
>     UV BY (s, d, h, g, p, pa, st) OUTER,
>     UC BY (s, d, h, g, p, pa, st) OUTER,
>     AT BY (s, d, h, g, p, pa, st) OUTER,
>     V BY (s, d, h, g, p, pa, st) OUTER,
>     C BY (s, d, h, g, p, pa, st) OUTER;
> OUTPUT = FOREACH J GENERATE
>     FLATTEN(group) as (s, d, h, g, p, pa, st),
>     COUNT_STAR(C) as c,
>     COUNT_STAR(V) as v,
>     SUM(AT.p1) as p1,
>     SUM(AT.p2) as p2,
>     SUM(AT.p3) as p3,
>     SUM(UC.q) as ucq,
>     SUM(UC.r) as ucr,
>     SUM(UV.q) as uvq,
>     SUM(UV.r) as uvr;
> "
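
To illustrate why a combiner could help here, the following is a minimal Python sketch of the general idea, not Pig's actual implementation. It assumes toy records of the form (relation, key, value) and uses hypothetical function names (`combine`, `reduce_cogroup`). Because COUNT and SUM are algebraic, each map split can collapse its records into one partial (count, sum) pair per key and per input relation, and the reducer only has to merge those partials.

```python
from collections import defaultdict


def combine(records):
    """Map-side partial aggregation (hypothetical combiner step).

    records: iterable of (relation, key, value) tuples from one map split.
    Returns one (key, relation, count, total) tuple per (key, relation) pair.
    """
    partial = defaultdict(lambda: [0, 0])
    for relation, key, value in records:
        acc = partial[(key, relation)]
        acc[0] += 1        # partial COUNT
        acc[1] += value    # partial SUM
    return [(key, rel, cnt, total)
            for (key, rel), (cnt, total) in partial.items()]


def reduce_cogroup(partials):
    """Reduce side: merge partials per key, keeping each relation separate."""
    merged = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for key, relation, cnt, total in partials:
        acc = merged[key][relation]
        acc[0] += cnt
        acc[1] += total
    return {key: {rel: tuple(acc) for rel, acc in rels.items()}
            for key, rels in merged.items()}


# Two hypothetical map splits holding records from relations C, AT and V.
split1 = [("C", "k1", 1), ("C", "k1", 1), ("AT", "k1", 5)]
split2 = [("C", "k1", 1), ("AT", "k1", 7), ("V", "k2", 1)]

result = reduce_cogroup(combine(split1) + combine(split2))
```

In this toy run the five records for key `k1` reach the reducer as four partial tuples instead of five raw records; with the roughly 350:1 input-to-output ratio quoted above, the reduction in shuffled data would presumably be far larger. Relations that are absent for a key (the OUTER case) simply have no entry in the sketch, so it does not model Pig's empty-bag or null semantics.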