[
https://issues.apache.org/jira/browse/HBASE-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923239#action_12923239
]
HBase Review Board commented on HBASE-3136:
-------------------------------------------
Message from: "Jonathan Gray" <[email protected]>
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1056/
-----------------------------------------------------------
Review request for hbase, Todd Lipcon and stack.
Summary
-------
Adds a sync(path) operation to ZKW and three calls to it from the CAS
operations in ZKAssign.
This addresses bug HBASE-3136.
http://issues.apache.org/jira/browse/HBASE-3136
Diffs
-----
trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 1025790
trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 1025790
Diff: http://review.cloudera.org/r/1056/diff
Testing
-------
Still need to test more. I'm not sure it's possible (or feasible in a
reasonable amount of time) to make a unit test for this. We'd probably need to
dig into ZK or mock the hell out of stuff.
Thanks,
Jonathan
> Stale reads from ZK can break the atomic CAS operations we have in ZKAssign
> ---------------------------------------------------------------------------
>
> Key: HBASE-3136
> URL: https://issues.apache.org/jira/browse/HBASE-3136
> Project: HBase
> Issue Type: Bug
> Components: zookeeper
> Affects Versions: 0.89.20100621, 0.89.20100924, 0.90.0
> Reporter: Jonathan Gray
> Assignee: Jonathan Gray
> Priority: Blocker
> Fix For: 0.90.0
>
>
> With ZK based region transitions, we rely on atomic state changes of regions
> in transition. For example, an RS needs to atomically switch a node from
> OFFLINE to OPENING, or the master needs to delete nodes that are in OPENED
> state, etc...
> The way we implement this is by:
> - Read existing data (returns byte[] and version in Stat)
> - Verify data is in expected state
> - Update to the new state, passing the expected version previously read
> This doesn't always work as expected because that initial read of the
> existing data could be a stale read (in ZK, writes are quorum writes but
> reads are not, so you can get stale data).
> Can provide a more explicit example if anyone is interested, but a fix is
> coming.
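
The read / verify / update steps quoted above, and the effect of putting a
sync in front of the read, can be sketched with a toy in-memory model. This is
not HBase or ZooKeeper code: ToyEnsemble, Snapshot, transition and
writeFromOtherClient are hypothetical names made up for the illustration, and
the OFFLINE -> OPENING -> OPENED sequence is only an example scenario, not the
specific failure case behind this issue.

```java
/** Toy model of one znode: a committed (leader) copy and a replica that may lag behind it. */
class StaleReadCasSketch {

    /** Data plus version, standing in for the byte[] and Stat returned by a read. */
    static final class Snapshot {
        final String state;
        final int version;
        Snapshot(String state, int version) { this.state = state; this.version = version; }
    }

    /** Hypothetical stand-in for the ensemble as seen by one client. */
    static final class ToyEnsemble {
        private Snapshot leader;   // committed state (writes are checked against this)
        private Snapshot replica;  // what the server this client reads from has applied

        ToyEnsemble(String initialState) {
            leader = new Snapshot(initialState, 0);
            replica = leader;
        }

        /** Reads are served from the replica, so they can be stale. */
        Snapshot read() { return replica; }

        /** Models sync(path): the replica catches up with the committed state. */
        void sync() { replica = leader; }

        /** Conditional write: succeeds only if expectedVersion matches the committed version. */
        boolean setData(String newState, int expectedVersion) {
            if (leader.version != expectedVersion) {
                return false; // corresponds to a bad-version failure
            }
            leader = new Snapshot(newState, leader.version + 1);
            replica = leader; // assumption of the model: a client sees its own writes
            return true;
        }

        /** A write committed by some other client; our replica has not applied it yet. */
        void writeFromOtherClient(String newState) {
            leader = new Snapshot(newState, leader.version + 1);
        }
    }

    /** Read existing data, verify the expected state, update with the version that was read. */
    static boolean transition(ToyEnsemble zk, String expected, String next, boolean syncFirst) {
        if (syncFirst) {
            zk.sync();
        }
        Snapshot seen = zk.read();
        if (!seen.state.equals(expected)) {
            return false; // verification failed (possibly only because the read was stale)
        }
        return zk.setData(next, seen.version);
    }

    public static void main(String[] args) {
        ToyEnsemble zk = new ToyEnsemble("OFFLINE");
        zk.writeFromOtherClient("OPENING"); // committed, but not yet visible to our reads

        System.out.println("without sync: " + transition(zk, "OPENING", "OPENED", false));
        System.out.println("with sync:    " + transition(zk, "OPENING", "OPENED", true));
    }
}
```

In this model the first attempt is rejected because the read still returns
OFFLINE, even though OPENING has already been committed; the second attempt
syncs first, reads the current data and version, and the conditional update
goes through.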