[ https://issues.apache.org/jira/browse/HBASE-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835532#action_12835532 ]

Jean-Daniel Cryans commented on HBASE-2223:
-------------------------------------------

Some design notes:

We need another class to manage multiple ReplicationSources (going to many 
slave clusters) sitting between ReplicationHLog and ReplicationSource; let's 
call it ReplicationSourceManager (RSM) for the moment. That class should be 
responsible for taking actions and keeping tabs on each outbound stream. When 
a source successfully sends a batch of edits to a peer, it should report the 
latest HLogKey to the RSM so that we can match it to an HLog file (using the 
writeTime) and then publish that in ZooKeeper for each slave cluster.
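
To make the RSM idea concrete, here is a minimal sketch of that bookkeeping. 
All names (registerHLog, logPositionShipped, etc.) are hypothetical, and a 
plain map stands in for the ZooKeeper publish step:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed ReplicationSourceManager (RSM).
class ReplicationSourceManager {
  // HLog start writeTime -> HLog file name, ordered by time.
  private final TreeMap<Long, String> hlogsByStartTime = new TreeMap<>();
  // What we would publish in ZK: peer cluster id -> last fully shipped HLog.
  final Map<String, String> publishedPositions = new ConcurrentHashMap<>();

  void registerHLog(long startWriteTime, String hlogName) {
    hlogsByStartTime.put(startWriteTime, hlogName);
  }

  // Called by a ReplicationSource after a batch of edits is acked by a peer.
  // The writeTime of the last shipped HLogKey identifies the HLog file:
  // it is the file with the greatest start writeTime <= that key's writeTime.
  void logPositionShipped(String peerId, long lastShippedWriteTime) {
    Map.Entry<Long, String> e = hlogsByStartTime.floorEntry(lastShippedWriteTime);
    if (e != null) {
      publishedPositions.put(peerId, e.getValue());  // would be a ZK setData
    }
  }
}
```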

We could detect that a peer is unreachable if the ReplicationSource hasn't 
reported after X time (configurable; not sure what the default should be). 
Here I'm still wondering what the best way would be to detect that a peer 
cluster is back... retrying connections to the peer ZK quorum? We also need 
to handle the case where the cluster is simply shut down (using the shutdown 
znode). At that point we stop queuing entries for that source and pile up all 
the HLogs to process in a list in ZK. We also need a way here of telling the 
Master not to delete those logs. We should handle the fact that an HLog may 
be moved to the oldlogs directory, so if the HLog isn't in the local log dir, 
it's probably in the other directory.
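
A rough sketch of that per-peer state, under stated assumptions: the timeout 
default is a placeholder, the queue stands in for the ZK list, and the 
directory names are illustrative:

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of peer-liveness tracking and HLog queuing while a
// peer cluster is down or shut down.
class PeerStreamState {
  static final long REPORT_TIMEOUT_MS = 10 * 60 * 1000; // placeholder default
  private long lastReportMs;
  final Deque<String> queuedHLogs = new ArrayDeque<>(); // would live in ZK

  PeerStreamState(long nowMs) { this.lastReportMs = nowMs; }

  void reportProgress(long nowMs) { this.lastReportMs = nowMs; }

  // Peer is considered unreachable if the source hasn't reported in time.
  boolean isPeerDown(long nowMs) {
    return nowMs - lastReportMs > REPORT_TIMEOUT_MS;
  }

  // While the peer is down, only record HLog names to replay later.
  void queueHLog(String hlogName) { queuedHLogs.addLast(hlogName); }

  // A rolled HLog may have been moved to the oldlogs directory, so fall
  // back to it when the file is no longer in the local log dir.
  static File locateHLog(File logDir, File oldLogDir, String name) {
    File f = new File(logDir, name);
    return f.exists() ? f : new File(oldLogDir, name);
  }
}
```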

When the cluster comes back, we process all HLogs in order, without merging 
with the current flow of entries, since we would now have two different sets 
of HLogs to keep track of (we could improve this in the future). Only when we 
reach the current HLog file do we flip the switch to take new entries. I 
expect that to be very tricky.
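
The catch-up phase described above could look roughly like this; the class 
and method names are made up, and shipping is modelled as collecting file 
names rather than reading real HLogs:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of the catch-up phase: drain the queued HLogs in
// order, and only flip to live replication once the current HLog is reached.
class CatchUpReplicator {
  final List<String> shipped = new ArrayList<>();
  boolean live = false;

  // Replays queued logs in order; returns once we hit the HLog currently
  // being written, at which point new entries come from the live stream.
  void catchUp(Deque<String> queuedHLogs, String currentHLog) {
    while (!queuedHLogs.isEmpty()) {
      String hlog = queuedHLogs.pollFirst();
      if (hlog.equals(currentHLog)) {
        live = true;  // flip the switch: start taking new entries
        return;
      }
      shipped.add(hlog); // would read and ship every edit in this HLog
    }
    live = true; // queue exhausted before reaching the current log
  }
}
```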

Even trickier is keeping track of those HLogs when an RS dies on the master 
cluster. The pile of HLogs to process will still be in ZK, along with the 
latest HLogKey that was processed. It means we have to somehow hand off that 
processing to one (or more) RSs. What I'm thinking is that the master, when 
done splitting logs, should hand that pile to a single RS, which will open a 
new ReplicationSource and hopefully complete the replication.
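
A sketch of that hand-off, assuming the dead RS's queue is still readable 
from ZK; the chooser (hash over live servers) and data structures are 
assumptions, not the actual design:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: after the master finishes splitting the logs of a
// dead region server, it assigns that server's replication queue (still
// recorded in ZK, with the last shipped position) to one surviving RS.
class ReplicationHandoff {
  // dead RS name -> queued HLogs, as they would be stored in ZK.
  final Map<String, Deque<String>> deadRsQueues = new HashMap<>();
  // live RS name -> piles it has been handed.
  final Map<String, Deque<String>> assignments = new HashMap<>();

  void recordDeadServer(String rsName, Deque<String> queue) {
    deadRsQueues.put(rsName, queue);
  }

  // Master-side: pick a single live RS and hand it the whole pile so it
  // can open a new ReplicationSource and finish the stream.
  String handOff(String deadRs, List<String> liveServers) {
    Deque<String> pile = deadRsQueues.remove(deadRs);
    String chosen =
        liveServers.get(Math.abs(deadRs.hashCode()) % liveServers.size());
    assignments.computeIfAbsent(chosen, k -> new ArrayDeque<>()).addAll(pile);
    return chosen;
  }
}
```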

We can use the information published in ZK to report the state of each 
replication stream per peer and show that in a UI.

> Handle 10min+ network partitions between clusters
> -------------------------------------------------
>
>                 Key: HBASE-2223
>                 URL: https://issues.apache.org/jira/browse/HBASE-2223
>             Project: Hadoop HBase
>          Issue Type: Sub-task
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.21.0
>
>
> We need a nice way of handling long network partitions without impacting a 
> master cluster (which pushes the data). Currently it will just retry over and 
> over again.
> I think we could:
>  - Stop replication to a slave cluster if it didn't respond for more than 10 
> minutes
>  - Keep track of the duration of the partition
>  - When the slave cluster comes back, initiate an MR job like HBASE-2221 
> Maybe we want less than 10 minutes; maybe we want this to be all automatic, 
> or just the first 2 parts. Discuss.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
