[ 
https://issues.apache.org/jira/browse/HBASE-14791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15001464#comment-15001464
 ] 

Alex Araujo commented on HBASE-14791:
-------------------------------------

Attached a v1 patch that adds size-based buffering for Deletes, similar to how 
Puts are buffered. It does not change 0.98 HTable semantics and is disabled by 
default, as [~apurtell] suggested.
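
For anyone who has not opened the patch, here is a rough sketch of what 
"size-based buffering" means in general. This is not the code from the 
attachment and it does not use the HBase client API. The class name 
SizeBasedBuffer, the sizeOf estimator, and the batchSender callback are all 
made up for illustration. The idea is to queue mutations locally, keep a 
running size estimate, and send one batch once the estimate exceeds a 
configured threshold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.ToLongFunction;

/**
 * Hypothetical size-based write buffer (illustration only, not HTable code).
 * Items are queued until their estimated total size exceeds a threshold,
 * then handed to the sender as a single batch.
 */
final class SizeBasedBuffer<T> {
    private final long flushThresholdBytes;        // assumed config value
    private final ToLongFunction<T> sizeOf;        // assumed per-item size estimator
    private final Consumer<List<T>> batchSender;   // assumed "send one batch" hook
    private final List<T> pending = new ArrayList<>();
    private long pendingBytes = 0;

    SizeBasedBuffer(long flushThresholdBytes,
                    ToLongFunction<T> sizeOf,
                    Consumer<List<T>> batchSender) {
        this.flushThresholdBytes = flushThresholdBytes;
        this.sizeOf = sizeOf;
        this.batchSender = batchSender;
    }

    /** Queue one item; flush when the buffered size exceeds the threshold. */
    void add(T item) {
        pending.add(item);
        pendingBytes += sizeOf.applyAsLong(item);
        if (pendingBytes > flushThresholdBytes) {
            flush();
        }
    }

    /** Send everything queued so far as one batch; no-op when empty. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Hand the sender a copy so clearing the queue does not affect it.
        batchSender.accept(new ArrayList<>(pending));
        pending.clear();
        pendingBytes = 0;
    }
}
```

With a buffer like this, the caller must still call flush() before closing, 
otherwise the last partial batch is never sent.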

I agree that having two buffering mechanisms is not ideal. If correctness is an 
issue and/or we want to avoid separate buffers, we could try using the Put 
buffering in HTable for both and disabling it for Deletes by default. That would 
align 0.98 buffering more closely with BufferedMutator in 1.0+, but would also 
be more invasive.

> [0.98] CopyTable is extremely slow when moving delete markers
> -------------------------------------------------------------
>
>                 Key: HBASE-14791
>                 URL: https://issues.apache.org/jira/browse/HBASE-14791
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.16
>            Reporter: Lars Hofhansl
>            Assignee: Alex Araujo
>         Attachments: HBASE-14791-0.98-v1.patch
>
>
> We found that some of our CopyTable jobs run for many hours, even when there 
> isn't that much data to copy.
> [~vik.karma] did his magic and found that the issue is with copying delete 
> markers (we use raw mode to also move deletes across).
> Looking at the code in 0.98 it's immediately obvious that deletes (unlike 
> puts) are not batched and hence sent to the other side one by one, causing a 
> network RTT for each delete marker.
> Looks like in trunk it's doing the right thing (using BufferedMutators for 
> all mutations in TableOutputFormat). So likely only a 0.98 (and 1.0, 1.1, 
> 1.2?) issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)