Hi,

Last year at a meetup I spoke with Lars George about the counters in HBase. What I understood is that the counters are stored as increments (i.e. increment without locking) and that during compaction and querying the increments are aggregated into the actual value.
So far I've examined the API, and this seems to work as long as the value is a long.

Incrementing longs is nice, but I would like to do things like:
- Calculating min, max
- Bloom filters
- Average (recording both the "count" and "sum")
- Variance and standard deviation (using http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm)

All of those need more bytes of internal storage and need custom code for storing, aggregating and querying. Querying especially, because I may want to ask several different questions of a single byte[]: if I store both the count and the sum in a single byte[], then I can ask getN(), getSum() and getAvg().

Now my question to you guys is how I can implement such a more generic form of "lock free increments" with user-defined setters, getters and a custom aggregator (used for both compacting and querying). Perhaps there is an example of how to do this?

--
Best regards / Met vriendelijke groeten,

Niels Basjes
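
To make the question concrete, the kind of value described above could be sketched in plain Java roughly as follows. This is only an illustration under assumptions: it does not use any HBase API, StatsCell and all of its methods are hypothetical names, and the byte[] layout is arbitrary. The variance merge follows the parallel algorithm from the Wikipedia page linked above. How something like merge() would be hooked into compaction and reads is exactly the open question.

```java
import java.nio.ByteBuffer;

// Hypothetical value type, NOT part of the HBase API: one mergeable
// aggregate that fits in a single byte[].
final class StatsCell {
    private final long n;      // number of samples
    private final double sum;  // sum of samples
    private final double m2;   // sum of squared deviations from the mean
    private final double min;
    private final double max;

    private StatsCell(long n, double sum, double m2, double min, double max) {
        this.n = n; this.sum = sum; this.m2 = m2; this.min = min; this.max = max;
    }

    // The "increment": a cell describing one observed value.
    static StatsCell of(double x) {
        return new StatsCell(1, x, 0.0, x, x);
    }

    // The aggregator. It is commutative and (up to floating-point rounding)
    // associative, so partial cells can be combined in any order.
    static StatsCell merge(StatsCell a, StatsCell b) {
        if (a.n == 0) return b;
        if (b.n == 0) return a;
        long n = a.n + b.n;
        double delta = b.sum / b.n - a.sum / a.n;  // difference of the two means
        double m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n;
        return new StatsCell(n, a.sum + b.sum, m2,
                Math.min(a.min, b.min), Math.max(a.max, b.max));
    }

    // Several questions answered from the same stored bytes.
    long getN() { return n; }
    double getSum() { return sum; }
    double getMin() { return min; }
    double getMax() { return max; }
    double getAvg() { return sum / n; }
    double getVariance() { return m2 / n; }  // population variance
    double getStdDev() { return Math.sqrt(getVariance()); }

    // Arbitrary fixed layout: one long followed by four doubles (40 bytes).
    byte[] toBytes() {
        return ByteBuffer.allocate(Long.BYTES + 4 * Double.BYTES)
                .putLong(n).putDouble(sum).putDouble(m2)
                .putDouble(min).putDouble(max)
                .array();
    }

    static StatsCell fromBytes(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw);
        return new StatsCell(buf.getLong(), buf.getDouble(), buf.getDouble(),
                buf.getDouble(), buf.getDouble());
    }
}
```

With this shape, each write would store StatsCell.of(value).toBytes() without reading the current value first, and any reader or compaction step would fold the stored cells together with merge() before calling getAvg(), getStdDev() and so on.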
