Hi, Jean-Marc

I am not aware of an implementation of #2 in HBase. In RocksDB there is a Merge operator which does exactly what you need. It can be done in HBase as well with the help of a specialized coprocessor.

RocksDB Merge: https://github.com/facebook/rocksdb/wiki/Merge-Operator

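To make the cell-level merge idea concrete, here is a minimal sketch in plain Java. It is only an illustration, not HBase or RocksDB code: the class CellMergeSketch and all of its methods are hypothetical names, and cell values are assumed to be decimal text (with "sum|count" for the average case, as in the question). The point is that each merge function is associative, so partial results can be combined in any grouping, whether at compaction time or at read time. A counter is the same accumulator with every put carrying 1.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

/** Hypothetical sketch: folds all stored versions of one cell into a single value. */
final class CellMergeSketch {

    /** Accumulator: every put carries a number as decimal text; the merged cell is the sum. */
    static String mergeSum(String left, String right) {
        return Long.toString(Long.parseLong(left) + Long.parseLong(right));
    }

    /** Average: every cell stores "sum|count"; merging adds both fields. */
    static String mergeAverage(String left, String right) {
        String[] a = left.split("\\|");
        String[] b = right.split("\\|");
        long sum = Long.parseLong(a[0]) + Long.parseLong(b[0]);
        long count = Long.parseLong(a[1]) + Long.parseLong(b[1]);
        return sum + "|" + count;
    }

    /** Turns a merged "sum|count" cell into the average the client wants to read. */
    static double readAverage(String cell) {
        String[] parts = cell.split("\\|");
        return (double) Long.parseLong(parts[0]) / Long.parseLong(parts[1]);
    }

    /** Applies an associative merge function across all versions of a cell. */
    static String mergeAll(List<String> versions, BinaryOperator<String> merge) {
        return versions.stream()
                .reduce(merge)
                .orElseThrow(() -> new IllegalArgumentException("no versions to merge"));
    }

    public static void main(String[] args) {
        // Three puts to the same accumulator cell.
        System.out.println(mergeAll(Arrays.asList("3", "4", "5"), CellMergeSketch::mergeSum));
        // Three puts to the same average cell, each one observation ("value|1").
        String merged = mergeAll(Arrays.asList("10|1", "20|1", "30|1"), CellMergeSketch::mergeAverage);
        System.out.println(merged + " -> " + readAverage(merged));
    }
}
```

In a coprocessor, a function like mergeAll would presumably be applied to the versions of a cell during compaction and again on the read path, but that wiring is not shown here.
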
-Vlad

On Wed, Mar 13, 2019 at 6:41 AM Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:

> Hi,
>
> I have a quick question regarding aggregation.
>
> First, let me explain my understanding. I see two types of aggregation.
>
> First is at the column level. Like, AVG(age) on a table. It will, on the
> server side, for each region, sum the age, and divide by the number of
> rows. Fine.
>
> Second is at the cell level. Imagine I want a counter. I do multiple puts
> for the exact same cell. At compaction time, or at read time, there will be
> an aggregation that will return only the sum of all those cells.
>
> AggregateImplementation is an implementation of the first case. It runs as
> a coprocessor Endpoint.
>
> Do we have an implementation of the 2nd one? There can be many different
> implementations. For counters, where we just put whatever and get an
> incremental number. For accumulators, where we put numbers and get the sum
> of all the numbers we have put. For averages, where we put numbers and get
> the average of all the puts (the cell will store something like "sum|count").
> Etc. I looked at the existing coprocessors and I don't see anything like
> that. Before starting to implement my own, I'm wondering if there is
> already an existing solution.
>
> Thanks,
>
> JMS
>