Hello Micah,

Yes, we are using a MapFn now, but that aggregation and computation is being done in the reduce phase. Since a CombineFn applied after a GBK runs on the map side, most of the computation that currently runs in the reduce phase could be done on the map side instead, leaving only some smaller aggregations and computations for the reduce phase. My point was to do some of the aggregation (and create a new object) in the map phase instead of in the reduce phase.
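To make the map-side idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the Crunch API: the names `MapSideAggregation`, `Stats`, `combineSplit` and `reduce` are hypothetical, and the in-memory lists merely stand in for input splits. It only illustrates aggregating values of one type (Integer) into a different type (`Stats`) per split, so that the reduce side just merges partial results.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: plain Java, not Crunch. All names here are hypothetical.
final class MapSideAggregation {

    /** The "different type" U: a partial aggregate built from Integer values. */
    static final class Stats {
        final long count;
        final long sum;

        Stats(long count, long sum) {
            this.count = count;
            this.sum = sum;
        }

        /** Merging is associative, so partials can be combined in any order. */
        Stats merge(Stats other) {
            return new Stats(count + other.count, sum + other.sum);
        }

        double mean() {
            return count == 0 ? 0.0 : (double) sum / count;
        }
    }

    /** "Map side": fold one split's (key, value) records into one Stats per key. */
    static Map<String, Stats> combineSplit(List<Map.Entry<String, Integer>> split) {
        Map<String, Stats> partial = new TreeMap<>();
        for (Map.Entry<String, Integer> record : split) {
            partial.merge(record.getKey(), new Stats(1, record.getValue()), Stats::merge);
        }
        return partial;
    }

    /** "Reduce side": merge the per-split partial aggregates for each key. */
    static Map<String, Stats> reduce(List<Map<String, Stats>> partials) {
        Map<String, Stats> result = new TreeMap<>();
        for (Map<String, Stats> partial : partials) {
            partial.forEach((key, stats) -> result.merge(key, stats, Stats::merge));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two in-memory lists standing in for two input splits.
        List<Map.Entry<String, Integer>> split1 =
                List.of(Map.entry("a", 1), Map.entry("a", 3), Map.entry("b", 10));
        List<Map.Entry<String, Integer>> split2 =
                List.of(Map.entry("a", 5), Map.entry("b", 20));

        Map<String, Stats> result =
                reduce(List.of(combineSplit(split1), combineSplit(split2)));

        result.forEach((key, stats) ->
                System.out.println(key + ": count=" + stats.count
                        + " sum=" + stats.sum + " mean=" + stats.mean()));
    }
}
```

With this shape, only one `Stats` per key per split has to be shuffled and sorted, rather than every raw value.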
Thanks,
Chandan

On Thu, Oct 17, 2013 at 3:48 PM, Micah Whitacre <[email protected]> wrote:

> Chandan,
> I think what you want is just a simple MapFn instead of a CombineFn. The
> doc of the CombineFn[1] sounds like what you want with the statement "A
> special DoFn <http://crunch.apache.org/apidocs/0.7.0/org/apache/crunch/DoFn.html>
> implementation that converts an Iterable
> <http://download.oracle.com/javase/6/docs/api/java/lang/Iterable.html?is-external=true>
> of values into a single value", but it expects the value to be of the same
> type. Since you want to combine the values into a different form, it
> should be fairly trivial to write a MapFn that converts the
> Iterable<T> -> U.
>
> [1] - http://crunch.apache.org/apidocs/0.7.0/org/apache/crunch/CombineFn.html
>
> On Thu, Oct 17, 2013 at 3:30 PM, Chandan Biswas <[email protected]> wrote:
>
> > I was trying to refactor some code and tried to use CombineFn, but when
> > I looked deeper I found that I can't, because Crunch doesn't provide the
> > functionality I need. For example, I have a PGroupedTable<S,T>. I wanted
> > to apply a CombineFn<S,T> to it and get a PCollection<S,U> instead of T.
> > Right now, CombineFn only allows the same type as the return value. The
> > use case is that this would save some time in sorting. It is natural
> > that aggregating objects on the map side can create an object of a new,
> > different type.
> >
> > Any thoughts on it? Am I missing anything? If this can be written in a
> > different way using what already exists, please let me know.
> >
> > Thanks
> > Chandan
