Hi,

Yes, I was talking about the dead entry in the index table rather than the actual data table.
Regards
Ram

> -----Original Message-----
> From: Wei Tan [mailto:w...@us.ibm.com]
> Sent: Tuesday, August 28, 2012 9:22 PM
> To: dev@hbase.apache.org
> Cc: Sandeep Tata
> Subject: Re: A general question on maxVersion handling when we have
> Secondary index tables
>
> Thanks for sharing a pointer to your implementation.
> My two cents:
> The timestamp is a way to do MVCC, and setting every KV with the same TS
> will make concurrency control very tricky and error prone, if not
> impossible.
> I think Ram is talking about the dead entry in the index table rather
> than the data table. Deleting old index entries upfront when there is a
> new put might be a choice.
>
> Best Regards,
> Wei
>
> Wei Tan
> Research Staff Member
> IBM T. J. Watson Research Center
> 19 Skyline Dr, Hawthorne, NY 10532
> w...@us.ibm.com; 914-784-6752
>
> From: Jesse Yates <jesse.k.ya...@gmail.com>
> To: dev@hbase.apache.org,
> Date: 08/28/2012 04:00 AM
> Subject: Re: A general question on maxVersion handling when we have
> Secondary index tables
>
> Ram,
>
> If I understand correctly, I think you can design your index such that
> you don't actually use the timestamp (e.g. everything gets put with a
> TS = 10, or some other non-special, relatively small number that's not
> 0, as I'd worry about that in HBase ;) Then when you set maxVersions to
> 1, everything should be good.
>
> You get a couple of wasted bytes from the TS, but with the prefixTrie
> stuff that should be pretty minimal overhead. If you do need to keep
> track of the timestamp, you should be able to munge that back up into
> the column qualifier (and just know that the last 64 bits are the
> timestamp). Again, a little more CPU cost, but it's really not that big
> of an overhead. It seems like you don't really care about the TS
> though, in which case this should be pretty simple.
>
> Out of curiosity, what are people using for their secondary indexing
> solutions? I know there are a bunch out there, but don't know what
> people have adopted, what they like/dislike, design tradeoffs made and
> why.
>
> Disclaimer: I recently proposed a secondary indexing solution myself
> (shameless self-plug:
> http://jyates.github.com/2012/07/09/consistent-enough-secondary-indexes.html
> ) and it's something I'm working on for Salesforce - open sourced at
> some point, promise!
>
> -Jesse
> -------------------
> Jesse Yates
> @jesse_yates
> jyates.github.com
>
> On Tue, Aug 28, 2012 at 12:24 AM, Ramkrishna.S.Vasudevan <
> ramkrishna.vasude...@huawei.com> wrote:
>
> > Hi All
> >
> > When we try to build any type of secondary index for a given table,
> > how can one handle maxVersions in the secondary index tables?
> >
> > For example, I have inserted
> >
> > Row1 - Val1 => t
> > Row1 - Val2 => t+1
> > Row1 - Val3 => t+2
> >
> > Ideally, if my max versions is only one, then Val3 should be my
> > result if I query the main table for row1.
> >
> > Now in my index I will have all of the above 3 entries. How can we
> > remove the older entries from the index table that do not fit into
> > maxVersions?
> >
> > Currently, the scan code that enforces maxVersions does not give any
> > hooks to learn which entries were skipped because of versions.
> >
> > So any suggestions on this? I am still looking through the code for
> > other options, but suggestions are welcome.
> >
> > Regards
> > Ram
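
For illustration, the "delete old index entries upfront when there is a new put" idea from Wei's reply can be sketched with a toy in-memory model. This is not HBase code: the IndexedTable class and its put/lookup methods are hypothetical names invented for this sketch, the data table is assumed to keep a single version per row (maxVersions = 1), and concurrency is ignored entirely, although a real implementation would have to deal with it.

```python
from typing import Dict, Optional, Set


class IndexedTable:
    """Toy model of a data table plus a value -> rows index table.

    Hypothetical illustration only; none of these names come from HBase.
    """

    def __init__(self) -> None:
        # row key -> latest value (models maxVersions = 1 on the data table)
        self.data: Dict[str, str] = {}
        # value -> set of row keys currently holding that value
        self.index: Dict[str, Set[str]] = {}

    def put(self, row: str, value: str) -> None:
        old: Optional[str] = self.data.get(row)
        if old is not None and old != value:
            # Remove the index entry that is about to become dead
            # before writing the new one.
            rows = self.index.get(old)
            if rows is not None:
                rows.discard(row)
                if not rows:
                    del self.index[old]
        self.data[row] = value
        self.index.setdefault(value, set()).add(row)

    def lookup(self, value: str) -> Set[str]:
        # Return a copy so callers cannot mutate the index.
        return set(self.index.get(value, set()))


# Replay the example from the original question: three puts to Row1.
table = IndexedTable()
for v in ("Val1", "Val2", "Val3"):
    table.put("Row1", v)
```

After the three puts, only the Val3 entry remains in the index, so an index lookup agrees with what a maxVersions = 1 read of the main table returns. The cost is one extra read of the old value on every put.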