[ 
https://issues.apache.org/jira/browse/HBASE-13639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947274#comment-14947274
 ] 

Dave Latham commented on HBASE-13639:
-------------------------------------

Definitely challenging to use under significant write traffic.  If it is not 
overwriting old data (updates causing old versions to expire as opposed to just 
inserts), then you can run it for a data time range to exclude the ongoing 
inserts.

> SyncTable - rsync for HBase tables
> ----------------------------------
>
>                 Key: HBASE-13639
>                 URL: https://issues.apache.org/jira/browse/HBASE-13639
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Dave Latham
>            Assignee: Dave Latham
>             Fix For: 2.0.0, 0.98.14, 1.2.0
>
>         Attachments: HBASE-13639-0.98-addendum-hadoop-1.patch, 
> HBASE-13639-0.98.patch, HBASE-13639-v1.patch, HBASE-13639-v2.patch, 
> HBASE-13639-v3-0.98.patch, HBASE-13639-v3.patch, HBASE-13639.patch
>
>
> Given HBase tables in remote clusters with similar but not identical data, 
> efficiently update a target table such that the data in question is identical 
> to a source table.  Efficiency in this context means using far less network 
> traffic than would be required to ship all the data from one cluster to the 
> other.  Takes inspiration from rsync.
> Design doc: 
> https://docs.google.com/document/d/1-2c9kJEWNrXf5V4q_wBcoIXfdchN7Pxvxv1IO6PW0-U/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to