[jira] [Commented] (HBASE-13153) enable bulkload to support replication

Ashish Singhi (JIRA) Thu, 27 Aug 2015 04:35:39 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716500#comment-14716500
 ]


Ashish Singhi commented on HBASE-13153:
---------------------------------------

Thanks for the comments, Ted.
bq. Can you describe how the max request size limit would be monitored ?
First, pardon me, it should be queue size. Whenever there is a request from 
ReplicationSource to ship the hfiles, the endpoint will send an RPC request to 
the peer RS only if the user-configured time interval or queue size limit is 
crossed. This is basically to avoid too many RPC requests to the peer RS.
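
To make the batching idea concrete, here is a rough sketch of how the size and 
time thresholds could gate the RPC. This is not the patch's code: the class 
name HFileShipmentBatcher, its parameters, and the choice to flush when a 
limit is reached (>=) are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch (not HBase code): queues hfile paths and releases them
 * as one batch only when a size or time limit is reached, so that the caller
 * issues one RPC per batch instead of one RPC per request.
 */
class HFileShipmentBatcher {
    private final int maxQueuedPaths;   // assumed "queue size limit" setting
    private final long maxWaitMillis;   // assumed "time interval" setting
    private final List<String> queue = new ArrayList<>();
    private long lastFlushMillis;

    HFileShipmentBatcher(int maxQueuedPaths, long maxWaitMillis, long nowMillis) {
        this.maxQueuedPaths = maxQueuedPaths;
        this.maxWaitMillis = maxWaitMillis;
        this.lastFlushMillis = nowMillis;
    }

    /**
     * Adds one hfile path to the queue. Returns the batch to ship if either
     * limit has been reached, otherwise an empty list (nothing to send yet).
     */
    List<String> offer(String hfilePath, long nowMillis) {
        queue.add(hfilePath);
        boolean sizeReached = queue.size() >= maxQueuedPaths;
        boolean intervalElapsed = nowMillis - lastFlushMillis >= maxWaitMillis;
        if (!sizeReached && !intervalElapsed) {
            return new ArrayList<>();
        }
        // A limit was reached: hand the whole queue to the caller and reset.
        List<String> batch = new ArrayList<>(queue);
        queue.clear();
        lastFlushMillis = nowMillis;
        return batch;
    }
}
```

The caller would send a single RPC to the peer RS for each non-empty batch 
returned by offer(). The clock value is passed in rather than read inside the 
class only to keep the sketch easy to exercise.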

bq. HFile paths are in ZK. Do we need to send the paths in RPC ?
Yes, the peer cluster will not have access to the source cluster's ZK.

bq. The response can be sent before HFile splitting is completed, right ?
Can you tell me at what point this splitting takes place during the 
replication process?

bq. Could there be collision between HFile names ?
AFAIK hfile names (UUIDs) are unique, no?

bq. This constraint is due to the limit on amount of data that can be stored in 
ZK. Have you thought of introducing a system table for recording information 
w.r.t. HFiles to be replicated ?
Not here. This constraint will go away once we have ZK-less replication 
(HBASE-10295). Currently the same holds for WAL edits too.

bq. Should visibility labels be rewritten during the replication ?
I think we should not do this in our replication code. IMO we should avoid 
system table data replication or any modification to it.

> enable bulkload to support replication
> --------------------------------------
>
>                 Key: HBASE-13153
>                 URL: https://issues.apache.org/jira/browse/HBASE-13153
>             Project: HBase
>          Issue Type: Bug
>          Components: API
>            Reporter: sunhaitao
>            Assignee: Ashish Singhi
>         Attachments: HBase Bulk Load Replication.pdf
>
>
> Currently we plan to use the HBase replication feature to deal with a disaster 
> tolerance scenario. But we encounter an issue: we will use bulkload very 
> frequently, and because bulkload bypasses the write path and does not generate 
> WAL, the data will not be replicated to the backup cluster. It's inappropriate 
> to bulkload twice, on both the active cluster and the backup cluster. So I 
> advise making some modifications to the bulkload feature to enable bulkload to 
> both the active cluster and the backup cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
