[ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14728572#comment-14728572
 ] 

Bhupendra Kumar Jain commented on HBASE-13153:
----------------------------------------------

Having the cluster id as part of the hfile metadata is really nice to have. 
This metadata can clearly indicate the source cluster. 

But with this approach, the cluster id needs to be added to the meta block of 
each hfile during replication. This requires rewriting each hfile's meta block, 
so we think it will slow down the replication process compared to writing the 
cluster id in a zk node. 

Also, to detect a cycle during replication, the replication endpoint needs to 
refer to the hfile metadata. Consider the case where the hfile is in the 
archive: I think no meta information is available in the cache for an archived 
file, so this may take more time too. Please correct me if I got this wrong.
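
To make the cycle check concrete, here is a minimal sketch. It assumes the set 
of cluster ids an hfile has already passed through is available to the 
endpoint, whether it was read from the hfile metadata or from a zk node. The 
class and method names are hypothetical and are not HBase API.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch, not HBase code: decides whether a bulk-loaded hfile
// should be shipped to a peer, based on the clusters it has already visited.
final class BulkLoadCycleCheck {

    // Returns false when the peer has already seen this hfile, i.e. shipping
    // it again would create a replication cycle.
    static boolean shouldReplicate(Set<UUID> visitedClusterIds, UUID peerClusterId) {
        return !visitedClusterIds.contains(peerClusterId);
    }

    public static void main(String[] args) {
        UUID clusterA = UUID.randomUUID();
        UUID clusterB = UUID.randomUUID();

        // The hfile was bulk loaded on A, so A is already in the visited set.
        Set<UUID> visited = new LinkedHashSet<>();
        visited.add(clusterA);

        System.out.println("ship to B: " + shouldReplicate(visited, clusterB));
        System.out.println("ship to A: " + shouldReplicate(visited, clusterA));
    }
}
```

The check itself is cheap either way. The cost difference discussed above is in 
how the visited set is obtained and updated for each hfile.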

> enable bulkload to support replication
> --------------------------------------
>
>                 Key: HBASE-13153
>                 URL: https://issues.apache.org/jira/browse/HBASE-13153
>             Project: HBase
>          Issue Type: New Feature
>          Components: Replication
>            Reporter: sunhaitao
>            Assignee: Ashish Singhi
>             Fix For: 2.0.0
>
>         Attachments: HBase Bulk Load Replication.pdf
>
>
> Currently we plan to use the HBase Replication feature to deal with disaster 
> tolerance scenarios. But we have run into an issue: we use bulkload very 
> frequently, and because bulkload bypasses the write path and does not generate 
> WAL, the data will not be replicated to the backup cluster. It's inappropriate 
> to bulkload twice, on both the active cluster and the backup cluster. So I 
> suggest modifying the bulkload feature so that a bulkload reaches both the 
> active cluster and the backup cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
