You are right; you have to patch the code in the aggregate package.
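As a rough illustration of what the new aggregator in case (2) might look like, here is a minimal sketch of a SampleValues class using reservoir sampling. To keep it compilable without Hadoop on the classpath, it implements AggregatorSketch, a hypothetical stand-in interface modeled loosely on ValueAggregator. The method names, the capacity and seed parameters, and the comma-joined report format are all assumptions for illustration, not the package's actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the aggregate package's ValueAggregator,
// declared here only so the sketch compiles on its own.
interface AggregatorSketch {
    void addNextValue(Object val);
    void reset();
    String getReport();
    ArrayList<String> getCombinerOutput();
}

// Keeps a fixed-size uniform sample of the values it has seen (reservoir sampling).
class SampleValues implements AggregatorSketch {
    private final int capacity;
    private final Random random;
    private final List<String> sample = new ArrayList<>();
    private long seen = 0;

    SampleValues(int capacity, long seed) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.random = new Random(seed);
    }

    @Override
    public void addNextValue(Object val) {
        seen++;
        String s = String.valueOf(val);
        if (sample.size() < capacity) {
            // Fill the reservoir first.
            sample.add(s);
        } else {
            // Pick a slot uniformly in [0, seen); replace only if it falls in the reservoir.
            long slot = (long) (random.nextDouble() * seen);
            if (slot < capacity) {
                sample.set((int) slot, s);
            }
        }
    }

    @Override
    public void reset() {
        sample.clear();
        seen = 0;
    }

    @Override
    public String getReport() {
        return String.join(",", sample);
    }

    @Override
    public ArrayList<String> getCombinerOutput() {
        return new ArrayList<>(sample);
    }
}

class SampleValuesDemo {
    public static void main(String[] args) {
        SampleValues agg = new SampleValues(3, 42L);
        for (int i = 0; i < 100; i++) {
            agg.addNextValue("value" + i);
        }
        System.out.println(agg.getReport());
    }
}
```

A real version would still need the generateValueAggregator method of ValueAggregatorBaseDescriptor to recognize the "SampleValues" type, as described in case (2). Note also that merging per-combiner samples in the reducer is only uniform if each sample carries the count of values it was drawn from; this sketch does not do that.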

On Fri, Apr 24, 2009 at 10:24 AM, Dan Milstein <dmilst...@hubteam.com> wrote:

> Runping,
>
> Thanks for the response. A question about case (2) below (which is, in
> fact, what I want to do):
>
> - Is there any way to do this without patching the code within the
>   aggregator package?
>
> It sure doesn't look like it, but just to make sure.
>
> Thanks again,
> -Dan M
>
>
> On Apr 24, 2009, at 12:56 PM, Runping Qi wrote:
>
>> A couple of general goals behind the aggregate package:
>>
>> 1. If you are an application developer using the aggregate package, you
>> only need to develop your own (user defined) valuator descriptor classes,
>> which are typically subclasses of ValueAggregatorDescriptor. You can use
>> the existing aggregator types (such as LongValueSum, ValueHistogram, etc.)
>>
>> 2. If you want to contribute new types of aggregator (for example, a
>> ValueAverage class that keeps track of the average of values would be a
>> much needed one), then you need to implement a class that implements
>> ValueAggregator, and update the generateValueAggregator method of
>> ValueAggregatorBaseDescriptor to handle your new aggregators.
>>
>> 3. If you want to contribute to the aggregate framework itself, you may
>> need to touch every bit of the code in the package.
>>
>> Runping
>>
>>
>> On Thu, Apr 23, 2009 at 1:44 PM, Dan Milstein <dmilst...@hubteam.com>
>> wrote:
>>
>>> Hello all,
>>>
>>> I've been using streaming + the aggregate package (available via
>>> -reducer aggregate), and have been very happy with what it gives me.
>>>
>>> I'm interested in writing my own new aggregate functions (in Java)
>>> which I could then access from my streaming code.
>>>
>>> Can anyone give me pointers towards how to make that happen? I've read
>>> through the aggregate package source, but I'm not seeing how to define
>>> my own, and get access to it from streaming.
>>>
>>> To be specific, here's the sort of thing I'd like to be able to do:
>>>
>>> - In Java, define a SampleValues aggregator, which chooses a sample of
>>>   the input given to it
>>>
>>> - From my streaming program, in say python, output:
>>>
>>>   SampleValues:some_key \t some_value
>>>
>>> - Have the aggregate framework somehow call my new aggregator for the
>>>   combiner and reducer steps
>>>
>>> Thanks,
>>> -Dan Milstein