Thanks for sharing. You're welcome to contribute this extension to the community.
On Feb 28, 2013, at 4:10 AM, Julian Wissmann <[email protected]> wrote:

> Hi,
>
> for a research project I wrote a custom coprocessor, for which
> ultimately, I just extended AggregationClient and
> AggregateImplementation.
> I needed two additional input parameters, a Long timePeriod over which
> to aggregate and an int count to know how many aggregations to return,
> the return value being a ConcurrentMap. The advantage of going with
> this approach is, that for aggregations with a large count but small
> periods, I don't need to either sort the data on the client side again
> or do n aggregations resulting in n scans, but instead get the same
> result with just one scan per region, which is a lot faster.
>
> Right now, the code is a little messy, as it was implemented as a
> quick and dirty proof of concept. However if there is interest in
> having this ability in hbase, I'd be pleased to clean it up, port it
> to head and release it into the wild.
> I realize, that not everyone has a data pattern as simple as ours and
> that this feature may not be overly useful to everyone. If anyone has
> an idea as to how to extend this functionality to make it more useful,
> let me know. I'm for example thinking about maybe having a more
> generic approach with some sort of Filter or something along the lines
> in order to not just being able to sort this for time periods but for
> key patterns also.
>
> Regards
> Julian
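
For anyone following the thread, the single-scan idea in the quoted message can be illustrated with a minimal sketch in plain Java. This is not Julian's code and it does not use the HBase coprocessor API: the class TimeBucketSketch, the methods sumByPeriod and mergeRegions, the startTime parameter, and the choice of a sum as the aggregate are all assumptions made for illustration. Only timePeriod, count, and the ConcurrentMap return type come from the message. The loop stands in for one pass over a region's rows, and the merge step stands in for the client combining per-region results.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch: none of these names come from HBase or from the original code.
final class TimeBucketSketch {

    // One pass over (timestamp, value) pairs, standing in for a single region scan.
    // Each value is added to the bucket its timestamp falls into; at most `count`
    // buckets of width `timePeriod`, starting at `startTime`, are produced.
    static ConcurrentMap<Long, Long> sumByPeriod(long[] timestamps, long[] values,
                                                 long startTime, long timePeriod, int count) {
        if (timestamps.length != values.length || timePeriod <= 0 || count < 0) {
            throw new IllegalArgumentException("mismatched input or invalid period/count");
        }
        // A skip list map keeps bucket keys sorted, so no client-side sort is needed.
        ConcurrentMap<Long, Long> sums = new ConcurrentSkipListMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            long offset = timestamps[i] - startTime;
            if (offset < 0) {
                continue; // before the first bucket
            }
            long bucket = offset / timePeriod;
            if (bucket >= count) {
                continue; // beyond the requested number of buckets
            }
            // Key each bucket by its start time.
            sums.merge(startTime + bucket * timePeriod, values[i], Long::sum);
        }
        return sums;
    }

    // Client-side step: combine the partial maps returned for each region.
    static ConcurrentMap<Long, Long> mergeRegions(Iterable<? extends Map<Long, Long>> partials) {
        ConcurrentMap<Long, Long> total = new ConcurrentSkipListMap<>();
        for (Map<Long, Long> partial : partials) {
            partial.forEach((bucketStart, sum) -> total.merge(bucketStart, sum, Long::sum));
        }
        return total;
    }
}
```

In this sketch, calling sumByPeriod once per region and passing the results to mergeRegions yields all count buckets from one pass per region, rather than one pass per bucket. A sum merges trivially across regions; an aggregate such as an average would need to carry both a sum and a row count per bucket.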
