[ 
https://issues.apache.org/jira/browse/HBASE-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923230#action_12923230
 ] 

Jonathan Gray commented on HBASE-3136:
--------------------------------------

Yes.  There are two ways to make this work.  The easy way is to just sync() 
before we start.  Or, as you describe, we stay optimistic: if the update fails 
or the node is not in the expected state, we sync(), re-read, and retry.

To get something done today, I'm going to just add sync() calls at the start of 
the two CAS operations.  We can optimize later by being more optimistic and 
doing what you describe.
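
To make the two options concrete, here is a rough sketch against a toy 
in-memory model, not the real ZooKeeper client or the actual ZKAssign code.  
ToyStore, tryCas, casWithSyncFirst and casOptimistic are all hypothetical 
names, and the model assumes the follower replica only catches up when sync() 
is called, which is a simplification of real ZK behaviour.

```java
public class ZkCasSketch {
    /** Toy znode contents: a state string plus a version bumped on every write. */
    static final class Versioned {
        final String state;
        final int version;
        Versioned(String state, int version) {
            this.state = state;
            this.version = version;
        }
    }

    /** Toy store (assumption): writes hit the leader, reads come from a lagging follower. */
    static final class ToyStore {
        private Versioned leader;
        private Versioned follower;
        int syncCalls = 0;

        ToyStore(String initialState) {
            leader = new Versioned(initialState, 0);
            follower = leader;
        }

        /** Possibly stale read, served by the follower. */
        Versioned read() {
            return follower;
        }

        /** Bring the follower up to date with the leader. */
        void sync() {
            syncCalls++;
            follower = leader;
        }

        /** Versioned write, checked against the leader's current version. */
        boolean setData(String state, int expectedVersion) {
            if (leader.version != expectedVersion) {
                return false;
            }
            leader = new Versioned(state, leader.version + 1);
            return true;
        }
    }

    /** Read, verify the expected state, then update with the version that was read. */
    static boolean tryCas(ToyStore store, String expected, String next) {
        Versioned seen = store.read();
        if (!seen.state.equals(expected)) {
            return false;
        }
        return store.setData(next, seen.version);
    }

    /** Easy way: always sync() before the read. */
    static boolean casWithSyncFirst(ToyStore store, String expected, String next) {
        store.sync();
        return tryCas(store, expected, next);
    }

    /** Optimistic way: only sync() and re-read if the first attempt fails. */
    static boolean casOptimistic(ToyStore store, String expected, String next) {
        if (tryCas(store, expected, next)) {
            return true;
        }
        store.sync();
        return tryCas(store, expected, next);
    }

    public static void main(String[] args) {
        ToyStore a = new ToyStore("OFFLINE");
        a.setData("OPENING", 0); // leader is now OPENING, follower still says OFFLINE
        System.out.println(tryCas(a, "OPENING", "OPENED"));           // false (stale read)
        System.out.println(casWithSyncFirst(a, "OPENING", "OPENED")); // true

        ToyStore b = new ToyStore("OFFLINE");
        b.setData("OPENING", 0);
        System.out.println(casOptimistic(b, "OPENING", "OPENED"));    // true, after one sync()
    }
}
```

In this model the sync-first version pays for a sync() on every call, while 
the optimistic version only pays for it when the first attempt fails.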

> Stale reads from ZK can break the atomic CAS operations we have in ZKAssign
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-3136
>                 URL: https://issues.apache.org/jira/browse/HBASE-3136
>             Project: HBase
>          Issue Type: Bug
>          Components: zookeeper
>    Affects Versions: 0.89.20100621, 0.89.20100924, 0.90.0
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.90.0
>
>
> With ZK-based region transitions, we rely on atomic state changes of regions 
> in transition.  For example, an RS needs to atomically switch a node from 
> OFFLINE to OPENING, or the master needs to delete nodes that are in the 
> OPENED state.
> The way we implement this is by:
> - Read existing data (returns byte[] and version in Stat)
> - Verify data is in expected state
> - Update to the new state, passing the expected version previously read
> This doesn't always work as expected because the initial read of the 
> existing data could be stale (in ZK, writes are quorum writes but reads are 
> not, so a read can return stale data).
> I can provide a more explicit example if anyone is interested, but a fix is 
> coming.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
