[
https://issues.apache.org/jira/browse/HBASE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846758#action_12846758
]
Todd Lipcon commented on HBASE-2294:
------------------------------------
I pushed an update to the doc in response to Stack's comments:
https://gist.github.com/336081/6c64c14c35fa778d74f3c7fdcfde09a38dc4b5c9
The time-travel thing is still somewhat worrisome. I think we have a few
options here:
a) We always allow time travel reads (a weak guarantee that is hard to program
against). To satisfy the other guarantees, writes and read-modify-writes will
not have this property.
b) We disallow time travel from a single client, but different clients may be
at different points in the timeline. That is to say, in the example of some
set of processes incrementing a cell, a single reader will never see a cell
decrease. However, a reader may see a cell at value N, communicate to a second
reader, and the second reader may then see the cell at a value less than N.
c) We give the user a call, something like "ensureReadsUptodate()". This ensures
that the reader will not read any data more stale than the time when this call
is made. This is exactly how ZooKeeper handles the stale read problem:
usually you get stale reads but don't care, and if you care, you call ZK's
sync() method.
d) We never allow time travel reads. I think this is nearly impossible to do
without killing performance (essentially the region server would have to verify
that it is still in charge of a region before every read).
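To make options (b) and (c) a bit more concrete, here is a rough client-side
sketch. It is not HBase code: VersionedValue, CellSource, StaleReadException
and MonotonicReadSession are hypothetical names, and it assumes every read
answer carries some monotonically increasing sequence id, which is an
assumption of the sketch rather than something the current RPCs are known to
provide. How the client would learn the "latest" sequence id for option (c)
is left out.

```java
/** Hypothetical read result: a cell value plus the sequence id of the state it reflects. */
final class VersionedValue {
    final long sequenceId;
    final long value;

    VersionedValue(long sequenceId, long value) {
        this.sequenceId = sequenceId;
        this.value = value;
    }
}

/** Hypothetical stand-in for whichever server answers the read. */
interface CellSource {
    VersionedValue read();
}

/** Signals that an answer is older than state this session has already observed. */
final class StaleReadException extends RuntimeException {
    StaleReadException(String message) {
        super(message);
    }
}

/** One client's view of the timeline. Not thread-safe: assumes one thread per session. */
final class MonotonicReadSession {
    // Lowest sequence id this session will still accept.
    private long floor = Long.MIN_VALUE;

    /** Option (b): reject any answer older than one this session already saw. */
    VersionedValue read(CellSource source) {
        VersionedValue result = source.read();
        if (result.sequenceId < floor) {
            throw new StaleReadException(
                "read at sequence " + result.sequenceId + " is older than floor " + floor);
        }
        floor = Math.max(floor, result.sequenceId);
        return result;
    }

    /**
     * Option (c): raise the floor so later reads cannot be staler than the
     * given point. Obtaining latestSequenceId is outside this sketch.
     */
    void ensureReadsUpToDate(long latestSequenceId) {
        floor = Math.max(floor, latestSequenceId);
    }
}
```

With two separate sessions, the second reader in the option (b) example can
still see a value below N, since each session only tracks its own floor.
Calling ensureReadsUpToDate() on the second session with the sequence id the
first reader observed closes that gap, at the cost of rejecting (or retrying)
reads served from a stale source.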
Thoughts?
> Enumerate ACID properties of HBase in a well defined spec
> ---------------------------------------------------------
>
> Key: HBASE-2294
> URL: https://issues.apache.org/jira/browse/HBASE-2294
> Project: Hadoop HBase
> Issue Type: Task
> Components: documentation
> Reporter: Todd Lipcon
> Priority: Blocker
> Fix For: 0.20.4, 0.21.0
>
>
> It's not written down anywhere what the guarantees are for each operation in
> HBase with regard to the various ACID properties. I think the developers know
> the answers to these questions, but we need a clear spec for people building
> systems on top of HBase. Here are a few sample questions we should endeavor
> to answer:
> - For a multicell put within a CF, is the update made durable atomically?
> - For a put across CFs, is the update made durable atomically?
> - Can a read see a row that hasn't been sync()ed to the HLog?
> - What isolation do scanners have? Somewhere between snapshot isolation and
> no isolation?
> - After a client receives a "success" for a write operation, is that
> operation guaranteed to be visible to all other clients?
> etc
> I see this JIRA as having several points of discussion:
> - Evaluation of what the current state of affairs is
> - Evaluate whether we currently provide any guarantees that aren't useful to
> users of the system (perhaps we can drop in exchange for performance)
> - Evaluate whether we are missing any guarantees that would be useful to
> users of the system