[
https://issues.apache.org/jira/browse/HBASE-11765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099873#comment-14099873
]
Tianying Chang commented on HBASE-11765:
----------------------------------------
[~lhofhansl] Thanks for the link. It seems HBASE-8806 is trying to solve
exactly the same problem, but with a different approach. My approach is to sort
all the kvs from all hlog entries. That guarantees that each batch() call sent
by the replication sink creates only one Put/Delete per row, so there is no row
lock contention. It feels a little like the approach taken by HBASE-6930. My
patch does not change the behavior of multi() in HRegion; it only affects the
replication sink implementation. With this change, an hlog that used to take
4 min 20 sec to replay needs only 30 sec. I will take a deeper look at
HBASE-8806. Thanks.
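
A minimal sketch of the merging idea described above, for illustration only; it
is not the code in HBASE-11765.patch. Kv and RowAction are hypothetical
stand-ins for KeyValue and Put/Delete, and the sketch groups with a sorted map
keyed by mutation type and row rather than sorting a flat list of kvs. Keeping
Puts and Deletes for the same row as separate actions is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RowMergeSketch {

    /** Hypothetical stand-in for a KeyValue: row key, mutation type, cell payload. */
    public static final class Kv {
        public final String row;
        public final boolean isDelete;
        public final String cell;

        public Kv(String row, boolean isDelete, String cell) {
            this.row = row;
            this.isDelete = isDelete;
            this.cell = cell;
        }
    }

    /** Hypothetical stand-in for a Put or Delete: one row, one type, many cells. */
    public static final class RowAction {
        public final String row;
        public final boolean isDelete;
        public final List<String> cells = new ArrayList<>();

        public RowAction(String row, boolean isDelete) {
            this.row = row;
            this.isDelete = isDelete;
        }
    }

    /** Merges kvs from ALL entries so each (type, row) pair yields one action. */
    public static List<RowAction> mergeAcrossEntries(List<List<Kv>> hlogEntries) {
        // TreeMap keeps the keys sorted, so actions come out ordered by type, then row.
        Map<String, RowAction> byTypeAndRow = new TreeMap<>();
        for (List<Kv> entry : hlogEntries) {
            for (Kv kv : entry) {
                String key = (kv.isDelete ? "D:" : "P:") + kv.row;
                byTypeAndRow
                    .computeIfAbsent(key, k -> new RowAction(kv.row, kv.isDelete))
                    .cells.add(kv.cell);
            }
        }
        return new ArrayList<>(byTypeAndRow.values());
    }

    /** Merges only within each entry, so a row seen in two entries yields two actions. */
    public static List<RowAction> mergePerEntry(List<List<Kv>> hlogEntries) {
        List<RowAction> actions = new ArrayList<>();
        for (List<Kv> entry : hlogEntries) {
            actions.addAll(mergeAcrossEntries(Collections.singletonList(entry)));
        }
        return actions;
    }

    public static void main(String[] args) {
        // Two hlog entries that each delete from rows r1 and r2.
        List<List<Kv>> entries = new ArrayList<>();
        List<Kv> first = new ArrayList<>();
        first.add(new Kv("r1", true, "a"));
        first.add(new Kv("r2", true, "a"));
        List<Kv> second = new ArrayList<>();
        second.add(new Kv("r1", true, "b"));
        second.add(new Kv("r2", true, "b"));
        entries.add(first);
        entries.add(second);

        System.out.println("per-entry actions: " + mergePerEntry(entries).size());
        System.out.println("merged actions:    " + mergeAcrossEntries(entries).size());
    }
}
```

With per-entry grouping the batch contains two actions for each row, which is
the case where the second action has to wait on the row lock; with grouping
across entries each row appears once per batch.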
> ReplicationSink should merge the Put/Delete of the same row into one Action
> even if they are from different hlog entry.
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-11765
> URL: https://issues.apache.org/jira/browse/HBASE-11765
> Project: HBase
> Issue Type: Improvement
> Components: Performance, Replication
> Affects Versions: 0.94.7
> Reporter: Tianying Chang
> Assignee: Tianying Chang
> Fix For: 0.94.23
>
> Attachments: HBASE-11765.patch
>
>
> The current ReplicationSink code makes sure it creates only one Put/Delete
> action for the kvs of the same row if they come from the same hlog entry.
> However, when Puts/Deletes for the same row exist in different hlog entries,
> multiple Put/Delete actions are created, which causes synchronization cost
> during the multi batch operation.
> In one of our applications, the traffic pattern deletes the same row twice
> for many rows, and we saw doMiniBatchMutation() invoked many times due to the
> row lock on the same row. The ReplicationSink side is very slow, and the
> replication queue builds up.
> We should put the Puts/Deletes for the same row into one Put/Delete action
> even if they are from different hlog entries.