[jira] Commented: (HBASE-3136) Stale reads from ZK can break the atomic CAS operations we have in ZKAssign

HBase Review Board (JIRA) Wed, 20 Oct 2010 17:27:49 -0700

    [ https://issues.apache.org/jira/browse/HBASE-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923257#action_12923257 ]


HBase Review Board commented on HBASE-3136:
-------------------------------------------

Message from: "Todd Lipcon" <[email protected]>


bq.  On 2010-10-20 16:49:52, Todd Lipcon wrote:
bq.  > Seems OK, but we're adding a couple of extra ms of latency on all of these calls. Is that going to be expensive when assigning lots of regions?
bq.  > It seems we should be optimistic, and only really need to sync if we see unexpected state or the checked put fails.
bq.  
bq.  Jonathan Gray wrote:
bq.      Yeah, gave that a quick shot.  It's not easy (the code gets messy quickly, so it needs to be well thought out).
bq.      
bq.      I'd like to commit this and we can open another jira to deal with the optimistic approach.

OK, that sounds good by me. We'll fix the perf issue in a follow-up.
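
For the follow-up, here is a rough sketch of what the optimistic idea might look like: try the checked put against whatever the client currently sees, and only pay for a sync when that attempt fails. This is a toy, single-threaded model and not the patch under review. `ToyZk`, `remoteWrite`, `checkedPut` and `optimisticTransition` are hypothetical names made up for illustration; none of them are HBase or ZooKeeper APIs.

```java
// Toy model only: illustrates "sync on failure", not real ZKAssign code.
class OptimisticSyncSketch {

    /** A znode with an authoritative copy and this client's possibly stale view. */
    static final class ToyZk {
        private String state;          // authoritative state
        private int version;           // authoritative version
        private String seenState;      // what this client last observed
        private int seenVersion;
        int syncCalls = 0;             // how many times we paid for a sync

        ToyZk(String initial) {
            state = initial;
            seenState = initial;
        }

        /** A write made by some other client; our local view does not see it yet. */
        void remoteWrite(String newState) {
            state = newState;
            version++;
        }

        /** Bring the local view up to date with the authoritative copy. */
        void sync() {
            seenState = state;
            seenVersion = version;
            syncCalls++;
        }

        /** Read, verify expected state, then write with the version we read. */
        boolean checkedPut(String expected, String next) {
            if (!seenState.equals(expected)) return false;  // unexpected state
            if (seenVersion != version) return false;       // version mismatch
            state = next;
            version++;
            seenState = state;                               // we observe our own write
            seenVersion = version;
            return true;
        }
    }

    /** Fast path without sync; sync and retry once only if the first try fails. */
    static boolean optimisticTransition(ToyZk zk, String expected, String next) {
        if (zk.checkedPut(expected, next)) return true;
        zk.sync();
        return zk.checkedPut(expected, next);
    }

    public static void main(String[] args) {
        ToyZk zk = new ToyZk("CLOSED");
        zk.remoteWrite("OFFLINE");  // e.g. the master forces the node OFFLINE
        System.out.println(optimisticTransition(zk, "OFFLINE", "OPENING"));
        System.out.println(optimisticTransition(zk, "OPENING", "OPENED"));
        System.out.println("syncs: " + zk.syncCalls);
    }
}
```

In this model a stale view can only make the first attempt fail, never succeed wrongly, because the write carries the version that was read. The real code has more cases to handle (deletes, watches, distinguishing a stale view from a genuinely wrong state), which is presumably where it gets messy.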

Would still like Stack to review. I don't know the master code well enough to 
know if any other places might be missing, but I agree in concept :)


- Todd


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1056/#review1590
-----------------------------------------------------------





> Stale reads from ZK can break the atomic CAS operations we have in ZKAssign
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-3136
>                 URL: https://issues.apache.org/jira/browse/HBASE-3136
>             Project: HBase
>          Issue Type: Bug
>          Components: zookeeper
>    Affects Versions: 0.89.20100621, 0.89.20100924, 0.90.0
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.90.0
>
>
> With ZK based region transitions, we rely on atomic state changes of regions 
> in transition.  For example, an RS needs to atomically switch a node from 
> OFFLINE to OPENING, or the master needs to delete nodes that are in OPENED 
> state, etc...
> The way we implement this is by:
> - Read existing data (returns byte[] and version in Stat)
> - Verify data is in expected state
> - Update to the new state, passing the expected version previously read
> This doesn't always work as expected because that initial read of the 
> existing data could be a stale read (in ZK, writes are quorum writes but 
> reads are not so you can get stale data).
> Can provide a more explicit example if anyone is interested, but a fix is 
> coming.
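
To make the read / verify / versioned-update steps in the description concrete, here is a small self-contained simulation of one way a stale read could trip them up. It is a toy model under stated assumptions, not the explicit example the reporter offered: `Leader`, `Follower`, `Versioned` and `transition` are hypothetical names, and the CLOSED to OFFLINE to OPENING sequence is just an illustrative choice. With a real ZooKeeper client the corresponding calls would be `getData(path, watch, stat)`, `setData(path, data, version)` and `sync(path, cb, ctx)`.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model only: none of these classes are HBase or ZooKeeper APIs.
class StaleReadSketch {

    /** Immutable snapshot of a znode: its state plus a version counter. */
    static final class Versioned {
        final String state;
        final int version;
        Versioned(String state, int version) {
            this.state = state;
            this.version = version;
        }
    }

    /** The quorum copy: writes land here and are version-checked. */
    static final class Leader {
        private final AtomicReference<Versioned> node;

        Leader(String initialState) {
            node = new AtomicReference<>(new Versioned(initialState, 0));
        }

        Versioned read() { return node.get(); }

        /** Versioned write: succeeds only if expectedVersion is still current. */
        boolean setData(String newState, int expectedVersion) {
            Versioned cur = node.get();
            if (cur.version != expectedVersion) return false;
            return node.compareAndSet(cur, new Versioned(newState, cur.version + 1));
        }
    }

    /** The server a client is connected to; it may lag behind the leader. */
    static final class Follower {
        private final Leader leader;
        private Versioned local;

        Follower(Leader leader) {
            this.leader = leader;
            this.local = leader.read();
        }

        Versioned getData() { return local; }    // served locally, possibly stale
        void sync() { local = leader.read(); }   // catch up with the leader
    }

    /** Read existing data, verify the expected state, then do a versioned update. */
    static boolean transition(Follower zk, Leader quorum, String expected,
                              String next, boolean syncFirst) {
        if (syncFirst) zk.sync();
        Versioned seen = zk.getData();
        if (!seen.state.equals(expected)) return false;  // verification fails
        return quorum.setData(next, seen.version);       // fails on version mismatch
    }

    public static void main(String[] args) {
        Leader leader = new Leader("CLOSED");
        Follower rsView = new Follower(leader);  // RS's server, snapshot taken now
        leader.setData("OFFLINE", 0);            // master moves the node to OFFLINE

        // The RS's server has not caught up, so the RS still reads CLOSED.
        System.out.println(transition(rsView, leader, "OFFLINE", "OPENING", false));
        // Syncing before the read makes the same transition go through.
        System.out.println(transition(rsView, leader, "OFFLINE", "OPENING", true));
    }
}
```

Here the stale read makes a legitimate OFFLINE to OPENING transition look invalid, and syncing before the read avoids that at the cost of an extra round trip on every call.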

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
