[ 
https://issues.apache.org/jira/browse/HBASE-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859925#action_12859925
 ] 

Kevin Peterson commented on HBASE-2406:
---------------------------------------

I lean towards there being no notion of ordering other than timestamps.

If multiple writes to a cell have the same timestamp, only one of those 
versions will be retained, and it is undefined which one.

If the user writes to cells with out of order timestamps, and the writes would 
make the cell exceed the number of versions the column family stores, the cell 
will contain those versions with the highest timestamp. More formally:

A column family retains N versions.
Given a cell C storing a possibly empty set of versions and timestamps 
S = { (v1, ts1), (v2, ts2), ... (vn, tsn) }, n <= N.
The user makes m writes to C: W = { (v1', ts1'), (v2', ts2'), ... (vm', tsm') }.
If m + n <= N, C will retain all writes as versions.
If m + n > N, C will contain those N (v, ts) from S union W with the highest ts.
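
The rule above can be sketched as a toy Python model. This is not HBase 
code: the name `retain` and the dict used to collapse equal timestamps are 
assumptions made for illustration. Which value survives a timestamp tie is 
undefined in the proposal; in this sketch the later entry happens to win.

```python
def retain(stored, writes, max_versions):
    """Return the (value, ts) pairs a cell keeps, newest first (toy model)."""
    by_ts = {}
    # Collapse entries that share a timestamp; the survivor is an artifact
    # of this sketch, since the proposal leaves it undefined.
    for value, ts in list(stored) + list(writes):
        by_ts[ts] = value
    # Keep the max_versions entries with the highest timestamps.
    newest_first = sorted(by_ts.items(), key=lambda item: item[0], reverse=True)
    return [(value, ts) for ts, value in newest_first[:max_versions]]


S = [("a", 10), ("b", 20)]          # versions already in the cell
W = [("c", 5), ("d", 30)]           # out-of-order writes
kept = retain(S, W, max_versions=3)  # the write at ts=5 is not retained
```

With N = 3, the write at timestamp 5 is dropped even though it arrived 
after the stored versions, because only timestamps decide what is kept.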

If the user writes to cells with a timestamp before the current time minus the 
column family's TTL, the write will be discarded.
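
A minimal sketch of that TTL check, again as a toy model of the proposed 
semantics and not HBase code. The function name `accepts_write` is 
hypothetical, and the sketch assumes the timestamp, the current time, and 
the TTL are all expressed in the same unit.

```python
def accepts_write(write_ts, now, ttl):
    """True if a write at write_ts is kept under the proposed TTL rule."""
    # A write is discarded when its timestamp is before (now - ttl).
    return write_ts >= now - ttl


recent = accepts_write(write_ts=900, now=1000, ttl=200)  # inside the window
stale = accepts_write(write_ts=700, now=1000, ttl=200)   # before now - ttl
```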

I wonder if there are any uses of timestamps that we can recommend without 
forcing people to understand all the details. Here's what I can think of:
1. If your data model has a preexisting timestamp, and this timestamp never 
changes, manually set timestamps will be more convenient than serializing a 
timestamp in your own format.
2. If your data model has a preexisting timestamp, and this timestamp changes 
in a way compatible with the behavior of HBase, and you understand the details 
of timestamps and versioning, manually set timestamps will be more convenient 
than serializing a timestamp in your own format.
3. If you want a consistent version of some data that spans multiple tables 
(e.g. a secondary index), you may want to use the same timestamp to insert 
into both tables so that you can use the exact timestamp as part of a get() 
after reading it out of one table.
4. If your data includes a meaningful timestamp, especially if that timestamp 
can change, you may find it more straightforward to store that timestamp in 
your own format rather than relying on HBase timestamps.

A likely point of confusion: if the user ignores the details of versioning 
and always queries for the most recent timestamp, the user could have a 
mental model like this:

- A cell stores a value and a timestamp
- I can supply the timestamp when I write a value
- I can read the timestamp
- *I can update the timestamp (incorrect)

The user can write the value again with a higher timestamp, which appears to 
update the timestamp. When the user tries the same thing with a lower 
timestamp, this doesn't work: a query for the most recent version still 
returns the higher timestamp.
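
The mismatch can be shown with a toy Python model of a versioned cell. This 
is a sketch under the semantics proposed in this comment, not HBase code; 
the class `Cell` and its methods are hypothetical names, and it assumes at 
least one version is retained.

```python
class Cell:
    """Toy versioned cell: keeps the max_versions highest timestamps."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.versions = {}  # timestamp -> value

    def put(self, value, ts):
        self.versions[ts] = value
        # Drop the oldest timestamps beyond the retention limit.
        for old in sorted(self.versions)[:-self.max_versions]:
            del self.versions[old]

    def get_latest(self):
        ts = max(self.versions)
        return self.versions[ts], ts


cell = Cell()
cell.put("v1", 100)
cell.put("v1", 200)                     # looks like "updating" the timestamp
latest_after_raise = cell.get_latest()  # value "v1" at timestamp 200
cell.put("v1", 50)                      # attempt to "lower" the timestamp
latest_after_lower = cell.get_latest()  # still timestamp 200
```

The second write adds an older version beneath the existing one instead of 
replacing it, so the most recent read is unchanged.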

> Define semantics of cell timestamps
> -----------------------------------
>
>                 Key: HBASE-2406
>                 URL: https://issues.apache.org/jira/browse/HBASE-2406
>             Project: Hadoop HBase
>          Issue Type: Task
>          Components: documentation
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.21.0
>
>
> There is a lot of general confusion over the semantics of the cell timestamp. 
> In particular, a couple questions that often come up:
> - If multiple writes to a cell have the same timestamp, are all versions 
> maintained or just the last?
> - Is it OK to write cells in a non-increasing timestamp order?
> Let's discuss, figure out what semantics make sense, and then move towards 
> (a) documentation, (b) unit tests that prove we have those semantics.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
