This really isn't documented anywhere. Chapter 8 of my book has a small
section on it, but that section didn't make it into the alpha of ch08 that
is currently posted.
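
For the streaming side of the question, here is a minimal Python sketch of a
mapper that emits records in the "Aggregator:key<TAB>value" form that
-reducer aggregate consumes. It uses the built-in LongValueSum aggregator;
the function names are made up for illustration. A custom name such as
SampleValues would only work once the Java side knows about it, and that
part is not covered here.

```python
import sys


def aggregate_record(aggregator, key, value):
    # One output line for the aggregate reducer: "<Aggregator>:<key>\t<value>".
    return "%s:%s\t%s" % (aggregator, key, value)


def emit_records(lines, aggregator="LongValueSum"):
    # Word-count style mapper: one record per whitespace-separated token.
    # "LongValueSum" is a built-in aggregator; swapping in "SampleValues"
    # is hypothetical and assumes a matching Java aggregator exists.
    for line in lines:
        for word in line.split():
            yield aggregate_record(aggregator, word, 1)


if __name__ == "__main__":
    for record in emit_records(sys.stdin):
        sys.stdout.write(record + "\n")
```

Run as the -mapper of a streaming job with -reducer aggregate, this should
produce per-word sums.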

On Thu, Apr 23, 2009 at 1:44 PM, Dan Milstein <dmilst...@hubteam.com> wrote:

> Hello all,
>
> I've been using streaming + the aggregate package (available via -reducer
> aggregate), and have been very happy with what it gives me.
>
> I'm interested in writing my own new aggregate functions (in Java) which I
> could then access from my streaming code.
>
> Can anyone give me pointers towards how to make that happen?  I've read
> through the aggregate package source, but I'm not seeing how to define my
> own, and get access to it from streaming.
>
> To be specific, here's the sort of thing I'd like to be able to do:
>
>  - In Java, define a SampleValues aggregator, which chooses a sample of the
> input given to it
>
>  - From my streaming program, in, say, Python, output:
>
> SampleValues:some_key \t some_value
>
>  - Have the aggregate framework somehow call my new aggregator for the
> combiner and reducer steps
>
> Thanks,
> -Dan Milstein
>



-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422
