What is the maximum number of versions for table 't1'?
When you issue the 'describe' command, you should see something similar to the
following:

VERSIONS => '1'



On Mon, Dec 9, 2013 at 4:47 PM, Niels Basjes <[email protected]> wrote:

> Hi,
>
> When I first started learning about HBase I compared the logic of setting
> new values to something that is similar to the way a tool like Subversion
> works: When you set a new value you don't overwrite the old one, you simply
> create a new version.
> Just like Subversion, you can then retrieve the old value at a later moment
> and that way see the situation at an earlier date.
>
> (The only real variation to the SVN model is that HBase only retains the
> last N versions of a cell.)
>
> There is however one situation where this comparison really fails: When you
> do a delete on a cell.
> If you want to retrieve the state of an item from Subversion and that item
> has been deleted in the current version, you can still get it back.
> With HBase, however, deleting a cell places a tombstone at a specific
> timestamp; internally the older values are still present.
>
> But when you try to retrieve such an older value then you still get an
> empty result back (i.e. no such cell).
> The direct consequence of the currently implemented model is that an
> application can never retrieve the correct state of a row at an older
> timestamp if a delete on any cell has occurred.
>
> Example:
>
> I create a table with one row:
>
> > create 't1', 'cf'
> > put 't1', 'rowid', 'cf:1', 'One', 1000
> > put 't1', 'rowid', 'cf:2', 'Two', 2000
> > put 't1', 'rowid', 'cf:3', 'Three', 3000
> > get 't1', 'rowid' , {TIMERANGE => [0,3500]}
>
>     COLUMN                     CELL
>      cf:1                      timestamp=1000, value=One
>      cf:2                      timestamp=2000, value=Two
>      cf:3                      timestamp=3000, value=Three
>     3 row(s) in 0.0150 seconds
>
> Then the delete of a cell at a later timestamp:
>
> > delete 't1', 'rowid', 'cf:1', 4000
>
> Now if I retrieve the row at time 3500, I would expect to still see the
> same values as above.
> This, however, is what I actually get:
>
> > get 't1', 'rowid' , {TIMERANGE => [0,3500]}
>
>     COLUMN                     CELL
>      cf:2                      timestamp=2000, value=Two
>      cf:3                      timestamp=3000, value=Three
>     2 row(s) in 0.0120 seconds
>
>
> Why has it been designed/implemented like this?
> What is the logic behind this model?
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes
>
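
To make the behaviour in the quoted example concrete, here is a minimal toy
model in Python. It is only a sketch of the semantics described in this thread,
not HBase code: the class name ToyRow and its methods are hypothetical, and it
assumes that a delete marker hides every version at or below its timestamp no
matter which time range the read asks for. The read range is treated as
half-open, [min_ts, max_ts).

```python
class ToyRow:
    """Toy model of a single row; illustrative only, not HBase's implementation."""

    def __init__(self):
        self.puts = {}        # column -> {timestamp: value}
        self.tombstones = {}  # column -> list of delete-marker timestamps

    def put(self, column, value, ts):
        # A put never overwrites; it adds a new version at timestamp ts.
        self.puts.setdefault(column, {})[ts] = value

    def delete(self, column, ts):
        # A delete only records a marker; the older values stay stored.
        self.tombstones.setdefault(column, []).append(ts)

    def get(self, min_ts=0, max_ts=float("inf")):
        """Return the newest visible (timestamp, value) per column in [min_ts, max_ts)."""
        result = {}
        for column, versions in self.puts.items():
            # Assumption: the marker applies even if it lies outside the read range.
            masked_up_to = max(self.tombstones.get(column, []), default=-1)
            visible = [
                (ts, value)
                for ts, value in versions.items()
                if ts > masked_up_to and min_ts <= ts < max_ts
            ]
            if visible:
                result[column] = max(visible)
        return result


row = ToyRow()
row.put("cf:1", "One", 1000)
row.put("cf:2", "Two", 2000)
row.put("cf:3", "Three", 3000)

before = row.get(0, 3500)   # read before the delete is issued
row.delete("cf:1", 4000)
after = row.get(0, 3500)    # same time range, after the delete

print(sorted(before))       # columns visible before the delete
print(sorted(after))        # columns visible after the delete
print(row.puts["cf:1"])     # the old value is still stored internally
```

In this model the second read no longer returns cf:1 even though its value at
timestamp 1000 is still stored, which mirrors the two 'get' results in the
quoted message.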
