[jira] [Commented] (HBASE-14460) [Perf Regression] Merge of MVCC and SequenceId (HBASE-8763) slowed Increments, CheckAndPuts, batch operations

stack (JIRA) Sat, 26 Sep 2015 11:29:52 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-14460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14909404#comment-14909404
 ]


stack commented on HBASE-14460:
-------------------------------

bq. After we're done this I wonder if we can circle back to an argument from 
the OP on this issue...

HBASE-3434

See also this old article on distributed counters with CRDT: 
https://queue.acm.org/detail.cfm?id=2462076. Relevant piece quoted below.

"...consider building an increment-only counter that is replicated on two 
servers. We might implement the increment operation by first reading the 
counter's value on one replica, incrementing the value by one, and writing the 
new value back on every replica. If the counter is initially at 0 and two 
different users simultaneously initiate increment operations on separate 
servers, both users may read 0 and then distribute the value 1 to the replicas; 
the counter ends up with a value of 1 instead of the correct value of 2. 
Instead, we can use a G-counter CRDT, which relies on the fact that increment 
is a commutative operation—it doesn't matter in what order the two increment 
operations are applied, as long as they are both eventually applied at all 
sites. With a G-counter, the current counter status is represented as the count 
of distinct increment invocations, similar to how counting is introduced at the 
grade-school level: by making a tally mark for every increment then summing the 
total. In our example, instead of reading and writing counter values, each 
invocation distributes an increment operation. All replicas end up with two 
increment operations, which sum to the correct value of 2. This works because 
the replicas understand the semantics of increment operations instead of 
providing general-purpose read/write operations, which are not commutative."
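
To make the tally idea in the quoted passage concrete, here is a minimal Java 
sketch. It is an illustration only, not HBase code: the GCounterSketch and 
Replica names, and the choice of a random UUID to identify each increment 
invocation, are assumptions made for the example. Each replica keeps the set of 
distinct increment operations it has seen, and the counter value is the size of 
that set, so delivery order and duplicate delivery do not change the result.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical illustration of the "tally mark" counter from the quote; not HBase code.
public class GCounterSketch {

    /** One replica of an increment-only counter. */
    static final class Replica {
        // Each increment invocation is stored as one unique tally mark.
        private final Set<UUID> marks = new HashSet<>();

        /** Record a local increment and return its id so it can be shipped to other replicas. */
        UUID increment() {
            UUID id = UUID.randomUUID();
            marks.add(id);
            return id;
        }

        /** Apply an increment that originated on another replica; re-delivery is harmless. */
        void apply(UUID id) {
            marks.add(id);
        }

        /** The counter value is the number of distinct increments seen so far. */
        long value() {
            return marks.size();
        }
    }

    public static void main(String[] args) {
        Replica a = new Replica();
        Replica b = new Replica();

        // Two users increment concurrently on separate replicas.
        UUID fromA = a.increment();
        UUID fromB = b.increment();

        // Each replica distributes its operation to the other, in any order.
        b.apply(fromA);
        a.apply(fromB);
        a.apply(fromB); // a duplicate delivery does not double count

        System.out.println(a.value() + " " + b.value()); // prints "2 2"
    }
}
```

Note that no replica ever reads the current value in order to write a new one; 
it only records operations and sums when asked, which is the property the quote 
relies on. A production design would also need to compact the set of marks, 
which this sketch leaves out.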

> [Perf Regression] Merge of MVCC and SequenceId (HBASE-8763) slowed 
> Increments, CheckAndPuts, batch operations
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14460
>                 URL: https://issues.apache.org/jira/browse/HBASE-14460
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>         Attachments: 14460.txt, region_lock.png
>
>
> As reported by 鈴木俊裕 up on the mailing list -- see "Performance degradation 
> between CDH5.3.1(HBase0.98.6) and CDH5.4.5(HBase1.0.0)" -- our unification of 
> sequenceid and MVCC slows Increments (and other ops) as the mvcc needs to 
> 'catch up' to our current point before we can read the last Increment value 
> that we need to update.
> We could say that our Increment is just done wrong and that we should just 
> be writing Increments and summing on read, but checkAndPut and batch 
> operations have the same issue. Fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
