Hi Frédéric,

Have you looked at http://hbase.apache.org/book/versions.html ?
What you want to do, if I understand correctly, is already part of the
HBase features...

This: http://outerthought.org/blog/417-ot.html can be interesting too.
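
To make the TTL idea concrete, here is a rough Python sketch (not HBase code) of how a per-column-family TTL behaves. HBase expresses the TTL in seconds; the 30-day month, the function names and the table names below are assumptions made only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ttl_seconds(months, days_per_month=30):
    """Approximate a retention period in months as a TTL in seconds.

    Assumption: a month is rounded to 30 days.
    """
    return months * days_per_month * SECONDS_PER_DAY


def is_expired(cell_ts_ms, now_ms, ttl_s):
    """Return True once the cell's age (in milliseconds) exceeds the TTL."""
    return now_ms - cell_ts_ms > ttl_s * 1000


# Hypothetical table names, one per retention period.
TTL_BY_TABLE = {
    "ttl_3_months": ttl_seconds(3),
    "ttl_6_months": ttl_seconds(6),
    "ttl_12_months": ttl_seconds(12),
}
```

With one table (or column family) per retention period, writes only have to pick the right target; no explicit delete is issued by the application.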

JM

2012/6/21, Frédéric Fondement <[email protected]>:
> Hi all!
>
> Before I start, I'd like to have some feedback about TTL performance in
> HBase.
>
> My use case is the following. I constantly have data coming into the base
> (i.e. a write-intensive application). This data should be kept for a
> certain amount of time, either 3, 6, 12... months, depending on some
> external conditions. I can live with some data registered to live only 3
> months even if conditions eventually change to 6 months.
>
> I can see three options here:
>
> opt. 1: indexing in a secondary table using a salted timestamp as a
> key (this is not a problem in my case)
> opt. 2: creating different tables like
> 'to-be-destroyed-in-august-2012', 'to-be-destroyed-in-june-2012'... and
> then merely killing them with a cron job
> opt. 3: creating tables like 'to-be-destroyed-in-3-months' (with a
> 3-month TTL), 'to-be-destroyed-in-6-months' (with a 6-month TTL)...
>
> What do you think is the most efficient?
> opt1. adds a little more load to my already write-intensive context
> opt2. looks nice (regarding deletion), but to read, I need to scan at
> least 12 different tables, and each month, my data will be buffered
> during table creation (and region splitting! for which I still don't
> really know how to choose split keys)
> opt3. looks the nicest (only 3-4 tables to scan when reading), but won't
> my daily major compaction become crazy?
>
> It would be great to have some clue before doing the job :-) !
>
> Best regards,
>
> Frédéric.
