Doug Meil may be able to point you to the related documentation. Also take a look at this: https://issues.apache.org/jira/browse/HBASE-4241
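
For reference, the row-key layout J-D suggests in the quoted thread below (one counter per day, read back over a 30-day window) can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain in-memory dict stands in for the HBase table, and the key format and the names `row_key`, `record_click`, and `trailing_count` are hypothetical, not part of any HBase API.

```python
from collections import defaultdict
from datetime import date, timedelta

# In-memory stand-in for an HBase table; dict keys play the role of row keys.
table = defaultdict(int)

def row_key(event_id, day):
    # Hypothetical key layout "<event_id>:<YYYYMMDD>", so that all days
    # for one event share a prefix and sort next to each other.
    return f"{event_id}:{day:%Y%m%d}"

def record_click(event_id, day, amount=1):
    # One counter per event per day, incremented in place.
    table[row_key(event_id, day)] += amount

def trailing_count(event_id, today, window_days=30):
    # Sum the fixed set of daily counters that cover the trailing window.
    days = (today - timedelta(days=n) for n in range(window_days))
    return sum(table.get(row_key(event_id, d), 0) for d in days)
```

In this sketch, days older than the window simply fall out of the sum; in HBase the 30-day table TTL mentioned below would be what actually removes the old rows. The read always touches at most 30 daily keys per event, however many events exist in total.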

On Thu, Sep 29, 2011 at 11:22 AM, Jameson Lopp <[email protected]> wrote:

> Hm, well I didn't mention a number of other requirements for the feature
> I'm building, but long story short, I need to keep track of millions to
> billions of these counters and need the lookup time to be as close to
> constant time as possible, thus I was really hoping to avoid doing table
> scans.
>
> I'll admit I know nothing of the dangers of auto-pruning; is there an
> article / documentation I could read about it? Google wasn't very helpful.
>
> --
> Jameson Lopp
> Software Engineer
> Bronto Software, Inc
>
> On 09/29/2011 02:12 PM, Jean-Daniel Cryans wrote:
>
>> My advice usually regarding timestamps is if it's part of your data
>> model, it should appear somewhere in an HBase key. 99% of the time
>> overloading the HBase timestamps is a bad idea, especially with
>> counters since there's auto-pruning done in the Memstore!
>>
>> I would suggest you make time part of your row key, maybe one counter
>> per day, and then set the TTL on your table to 30 days. Then all you
>> need to do is a sequential scan for those 30 days maybe with a prefix
>> that refers to some event id.
>>
>> OpenTSDB is another way of doing it: http://opentsdb.net/
>>
>> J-D
>>
>> On Thu, Sep 29, 2011 at 11:04 AM, Jameson Lopp <[email protected]> wrote:
>>
>>> I wish to store a count of 30-day trailing event data (e.g. # of clicks
>>> in past 30 days) and ended up reading the documentation for setTimeRange
>>> in the Increment operation.
>>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Increment.html#getTimeRange%28%29
>>>
>>> I was hoping someone could clarify if it works as I'm imagining in this
>>> example scenario.
>>>
>>> 1) Current click count is 0
>>>
>>> 2) I process a click and I perform an increment operation with the time
>>> range set to minStamp = now and maxStamp = 30 days from now
>>>
>>> 3) I query for the value immediately and find it to be 1
>>>
>>> 4) Assuming no other clicks come in, if I query for the value in 31 days,
>>> it will be returned as 0
>>>
>>> In essence, I'm looking for a way to set a TTL on my increment operation.
>>> Is this how it actually works? The documentation is a bit vague and I
>>> could imagine several other scenarios.
>>> --
>>> Jameson Lopp
>>> Software Engineer
>>> Bronto Software, Inc
