I suspect the honest answer would be "because the BigTable paper had it" :P
There are several aspects to cell versioning (I may be missing some).

First (not the most important), due to the way HBase stores things (write-once files), it comes very cheaply - very little runtime cost, and not much code needs to be written to have it.

Second, internally, versioning allows snapshot isolation (within a server) to work - with multiple versions present, scanners can read older ones to get a consistent view (that's MVCC).

Third, user-visible, timestamp-based cell versioning is there so that users can control the order of things (e.g. delete all cells before...), either through fabricated timestamps or using external timestamps, e.g. from external logs. In fact, with the current HBase implementation of auto-ts (there are JIRAs to fix it), that's the only "bulletproof" way to use HBase; internal HBase versioning relies on server clocks, which is fraught with peril (granted, most systems will rarely hit these problems, and may be ok with some reordering anyway).

Fourth, multiple versions as such can be used for some application-specific scenarios; the Percolator paper is a good example.

On Sun, Dec 8, 2013 at 9:35 AM, Michael Segel <[email protected]> wrote:

> Hi,
>
> In a different thread, we were discussing good and better schema designs.
> In order to really understand why one should or should not do something,
> its kind of important to understand the underlying reasons why HBase was
> designed the way it was.
>
> So since we have a bunch of committers here, and cc'ing the Dev list,
>
> I'd like to explore why does HBase have cell versioning. What's its
> purpose. How is it implemented. and Why.
>
> This may seem a bit esoteric, but it would go a long way in educating many
> of the users on the hbase mailing list.
>
> Also it may be a good couple of paragraphs to add to the online
> reference...
>
> -Mike
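
P.S. To make the point about caller-supplied timestamps a bit more concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the HBase client API: the class VersionedCellStore, its methods (put, delete_up_to, get) and the row/column/value names are all hypothetical, made up for illustration. It only shows the idea that ordering comes from the timestamp the writer attaches (say, from an external log), not from the order in which writes arrive.

```python
class VersionedCellStore:
    """Toy model of timestamp-versioned cells (hypothetical, not the HBase API)."""

    def __init__(self):
        # (row, column) -> {timestamp: value}
        self._cells = {}
        # (row, column) -> highest "delete everything at or before" timestamp
        self._tombstones = {}

    def put(self, row, column, value, ts):
        """Store a value under a caller-supplied timestamp."""
        self._cells.setdefault((row, column), {})[ts] = value

    def delete_up_to(self, row, column, ts):
        """Mask every version whose timestamp is <= ts."""
        key = (row, column)
        self._tombstones[key] = max(ts, self._tombstones.get(key, ts))

    def get(self, row, column, max_versions=1):
        """Return up to max_versions (ts, value) pairs, newest timestamp first."""
        key = (row, column)
        cutoff = self._tombstones.get(key)
        live = [
            (ts, value)
            for ts, value in self._cells.get(key, {}).items()
            if cutoff is None or ts > cutoff
        ]
        live.sort(key=lambda pair: pair[0], reverse=True)
        return live[:max_versions]


store = VersionedCellStore()

# Writes arrive out of order, but each carries a timestamp from an external log.
store.put("order42", "status", "shipped", ts=200)
store.put("order42", "status", "ordered", ts=100)

# The newest timestamp wins, regardless of arrival order.
latest = store.get("order42", "status")

# "Delete all cells before...": mask everything at or before ts=150.
store.delete_up_to("order42", "status", ts=150)
history = store.get("order42", "status", max_versions=3)

# In this toy model, a late write with an old timestamp stays masked too.
store.put("order42", "status", "packed", ts=120)
after_late_put = store.get("order42", "status", max_versions=3)
```

Because the writer picks the timestamps, the result does not depend on any server's clock, which is the property the third point above is about. How a real cluster treats tombstones and late writes is up to HBase itself; the last step here is just how this sketch behaves.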
