Hi Gabriel. Thanks for that. It seemed a bit wrong to *not* be using combineValues, but a DoFn works just fine.
Cheers,
Dave

On 5 April 2013 16:48, Gabriel Reid <[email protected]> wrote:

> Hi Dave,
>
> On Fri, Apr 5, 2013 at 5:05 PM, Dave Beech <[email protected]> wrote:
>>
>> I have a PGroupedTable<A,B> and I want to aggregate / combine the values
>> to produce a PCollection<C> - in other words, I need the type of the
>> aggregate to be different to the original value type.
>>
>> What's the best approach? The combineValues method takes either an
>> Aggregator or a CombineFn but as far as I can see, both of these assume the
>> end result will be of the same type as the values.
>>
>
> The approach that I always use for this is just creating a custom DoFn to
> operate on the PGroupedTable and construct the instance of type C based on
> the incoming Iterable from the PGroupedTable. This basically works out to
> the same as an Aggregator.
>
> I don't think that this scenario would be technically applicable to a
> CombineFn, because the CombineFn can be called any number of times on an
> incoming set of values, on both the map and reduce sides of a job. In order
> to map values to another type, the intermediate value of type C would
> somehow need to be given to the CombineFn each time it was used.
>
> - Gabriel
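
The DoFn approach described above can be sketched in plain Java, without the Crunch dependency. This is only an illustration under assumptions: Emitter and GroupedDoFn below are hypothetical stand-ins defined locally, not the Crunch classes, and Stats is an invented aggregate type playing the role of C, with Integer playing the role of the value type B.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupedAggregationSketch {

    // Hypothetical stand-in for an emitter; NOT the Crunch class.
    interface Emitter<T> {
        void emit(T value);
    }

    // Hypothetical stand-in for a DoFn applied to one grouped entry:
    // it sees a key with all of its values and may emit any output type T.
    interface GroupedDoFn<K, V, T> {
        void process(K key, Iterable<V> values, Emitter<T> emitter);
    }

    // The aggregate type "C". It is deliberately not the value type (Integer).
    static final class Stats {
        final String key;
        final long count;
        final long sum;

        Stats(String key, long count, long sum) {
            this.key = key;
            this.count = count;
            this.sum = sum;
        }

        @Override
        public String toString() {
            return key + ": count=" + count + ", sum=" + sum;
        }
    }

    // Folds every value of one key into a single Stats instance.
    // The whole Iterable is consumed in one call, so no intermediate
    // Stats ever has to be fed back in as an input value.
    static final GroupedDoFn<String, Integer, Stats> TO_STATS =
        (key, values, emitter) -> {
            long count = 0;
            long sum = 0;
            for (int v : values) {
                count++;
                sum += v;
            }
            emitter.emit(new Stats(key, count, sum));
        };

    // Simulates applying the function to each entry of a grouped table.
    static List<Stats> aggregate(Map<String, List<Integer>> grouped) {
        List<Stats> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            TO_STATS.process(e.getKey(), e.getValue(), out::add);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        grouped.put("a", Arrays.asList(1, 2, 3));
        grouped.put("b", Arrays.asList(10));
        for (Stats s : aggregate(grouped)) {
            System.out.println(s);
        }
    }
}
```

In actual Crunch code the same loop would presumably live in the process method of a DoFn passed to parallelDo on the PGroupedTable, together with a PType for C. The sketch also hints at why a CombineFn does not fit: a combiner may be re-applied to its own partial results, so its output has to be usable as its input again, which a Stats object is not when the values are Integers.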
