Hi,

We have a simple HBase schema: row key = subscriber id. Column family A = counters (all kinds of aggregations).
Event records have a UUID, and in some scenarios we might get duplicate events. We should not count the duplicates. A possible solution was to keep event ids as qualifiers in another CF and do checkAndIncrement only if the event id is not found. I understand how to utilize a RegionObserver to solve the problem. Any other suggestions?

Thanks,
Lior.

On Sun, Apr 28, 2013 at 10:55 PM, Asaf Mesika <[email protected]> wrote:

> Yep.
> You can write a RegionObserver which takes all event qualifiers with a
> time stamp larger than a certain grace period, sums them up, adds the sum
> to the current value of the Count qualifier and emits an updated Count
> qualifier.
> I wrote something very similar for us at Akamai and it improved throughput
> by 10x. I'm working on open sourcing it.
>
> On Saturday, April 27, 2013, Lior Schachter wrote:
>
> > Hi Ted,
> > Thanks for the prompt response.
> > I've already had a look at HRegionServer.checkAndPut and the
> > implementation looks quite straightforward.
> > That's why I was wondering why the other 2 methods are not available... or
> > planned (couldn't find a Jira).
> > Seems like useful functionality.
> >
> > Anyhow, I'm not allowed to make any source code modifications to the
> > HBase installation (in production) so I reckon I'll have to find a
> > workaround.
> >
> > This is my use case:
> > Updating user counters by events.
> > We may get (in rare cases) duplicate events.
> > Should not count the duplicates.
> >
> > My initial thought was to have an event_id qualifier for each incoming
> > event (with '1' value). By checking if event_id exists before
> > incrementing I can avoid duplicates.
> > Without the checkAndIncrement functionality I must make 2 round trips for
> > each event (which doesn't make sense).
> >
> > Any ideas how I can solve this issue?
> >
> > Thanks,
> > Lior
> >
> > On Sat, Apr 27, 2013 at 4:23 PM, Ted Yu <[email protected]> wrote:
> >
> > > Take a look at the following method in HRegionServer:
> > >
> > >   public boolean checkAndPut(final byte[] regionName, final byte[] row,
> > >       final byte[] family, final byte[] qualifier, final byte[] value,
> > >       final Put put) throws IOException {
> > >
> > > You can create checkAndIncrement() in a similar way.
> > >
> > > Cheers
> > >
> > > On Sat, Apr 27, 2013 at 9:02 PM, Lior Schachter <[email protected]> wrote:
> > >
> > > > Hi,
> > > > I want to increment a cell value only after checking a condition on
> > > > another cell. I could find checkAndPut/checkAndDelete on
> > > > HTableInterface. It seems that checkAndIncrement (and checkAndAppend)
> > > > are missing.
> > > >
> > > > Can you suggest a workaround for my use-case? Working with version
> > > > 0.94.5.
> > > >
> > > > Thanks,
> > > > Lior
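
For reference, the behaviour requested in this thread (increment a counter only if the event id has not been recorded yet, as one atomic step per row) can be modeled in plain Java. This is only an in-memory sketch and not HBase client code: the class DedupCounterRow and its methods are hypothetical names, the two collections stand in for the counters CF and the event-id CF, and the synchronized keyword stands in for the row lock that a server-side implementation (for example a RegionObserver) would presumably hold.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** In-memory model of one subscriber row (hypothetical, not an HBase class). */
final class DedupCounterRow {
    // Stands in for column family A: counter qualifier -> current value.
    private final Map<String, Long> counters = new HashMap<>();
    // Stands in for the second CF: one qualifier per event id already counted.
    private final Set<String> seenEventIds = new HashSet<>();

    /**
     * Models the missing checkAndIncrement: records the event id and applies
     * the increment only if the event id was not present before.
     * Returns true if the increment was applied, false for a duplicate.
     */
    synchronized boolean checkAndIncrement(String eventId, String counter, long amount) {
        if (!seenEventIds.add(eventId)) {
            return false; // duplicate event: leave the counter untouched
        }
        counters.merge(counter, amount, Long::sum);
        return true;
    }

    /** Returns the current counter value, or 0 if it was never incremented. */
    synchronized long get(String counter) {
        return counters.getOrDefault(counter, 0L);
    }
}
```

Calling checkAndIncrement twice with the same event id changes the counter only once. The check and the increment happen under one lock, which is the property the two-round-trip client workaround lacks.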
