Hi,

I couldn't find sufficient documentation or examples of using combiners
in non-trivial ways. Say my map emits values of type Set<String>; after
grouping by key, I want to emit the _size_ of the union of the sets of
strings, i.e., size(union(Iterable<Set<String>>)). Thus, the combiner's
type is Iterable<Set<String>> -> Set<String>, but the reducer's type is
Iterable<Set<String>> -> Int.
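
To make the two types concrete, here is a minimal plain-Java sketch of what
I mean. It uses no Crunch, Hadoop, or Spark APIs; the class and method names
(UnionSizeSketch, combine, reduce) are made up for illustration only.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of any framework.
public class UnionSizeSketch {

    // Combiner: Iterable<Set<String>> -> Set<String>
    // Merges partial sets for one key; safe to apply repeatedly.
    static Set<String> combine(Iterable<Set<String>> values) {
        Set<String> union = new HashSet<>();
        for (Set<String> s : values) {
            union.addAll(s);
        }
        return union;
    }

    // Reducer: Iterable<Set<String>> -> int
    // Same union, but the final output is only its size.
    static int reduce(Iterable<Set<String>> values) {
        return combine(values).size();
    }

    public static void main(String[] args) {
        // Two map-side partial results for the same key.
        Set<String> partial1 = combine(List.of(Set.of("a", "b"), Set.of("b")));
        Set<String> partial2 = combine(List.of(Set.of("c"), Set.of("a", "c")));
        // The reduce side sees the combined sets and emits a count.
        System.out.println(reduce(List.of(partial1, partial2)));
    }
}
```

The point is that the combiner's output type has to match the map's output
type (Set<String>), while the final step maps it to something else.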

To my knowledge, both MapReduce and Spark allow a combiner to have a result
type different from the reducer's. However, unless I missed something, this
is not expressible in Crunch. Shouldn't PGroupedTable.combineValues return
a PGroupedTable to allow composition with mapValues?

Thanks,

stan
