[ https://issues.apache.org/jira/browse/HBASE-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835532#action_12835532 ]
Jean-Daniel Cryans commented on HBASE-2223:
-------------------------------------------

Some design notes:

We need another class to manage multiple ReplicationSources (going to many slave clusters) between ReplicationHLog and ReplicationSource; let's call it ReplicationSourceManager (RSM) for the moment. That class should be responsible for taking actions and keeping tabs on each outbound stream. When a source successfully sends a batch of edits to a peer, it should report the latest HLogKey to the RSM so that we can match it to an HLog file (using the writeTime) and then publish that in ZooKeeper for each slave cluster.

We could detect that a peer is unreachable if its ReplicationSource hasn't reported after X time (configurable; I'm not sure what the default should be). Here I'm still wondering what would be the best way to detect that a peer cluster is back... retrying connections to the peer's ZK quorum? We also need to handle the case where the cluster is simply shut down (using the shutdown znode). At that point we stop queuing entries for that source and pile up all the HLogs to process in a list in ZK. We also need a way here of telling the Master to not delete those logs. We should handle the fact that an HLog may be moved to the oldlogs directory, so if the HLog isn't in the local log dir, it's probably in the other directory.

When the cluster comes back, we process all HLogs in order without merging with the current flow of entries, since we would then have two different sets of HLogs to keep track of (we could improve this in the future). It's only when we reach the current HLog file that we flip the switch to take new entries. I expect that to be very tricky.

Even trickier is keeping track of those HLogs when a RS dies on the master cluster. The pile of HLogs to process will still be in ZK, along with the latest HLogKey that was processed. It means we have to somehow hand off that processing to some or one RS.
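To make the RSM bookkeeping concrete, here is a minimal Java sketch of the idea described above. All class and method names are hypothetical, and a plain in-memory map stands in for the ZooKeeper publishing step; a real implementation would write the position under a replication znode per peer.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of ReplicationSourceManager bookkeeping: each
// ReplicationSource reports the writeTime of the last HLogKey its peer
// confirmed, and the manager flags peers that stop reporting.
public class ReplicationSourceManagerSketch {
    // peerId -> writeTime of the last HLogKey confirmed by that peer
    // (in real code this would be published to ZooKeeper, not kept in a map)
    private final Map<String, Long> lastConfirmed = new ConcurrentHashMap<>();
    // peerId -> wall-clock time of the last report, for unreachable detection
    private final Map<String, Long> lastReport = new ConcurrentHashMap<>();
    private final long timeoutMs;

    ReplicationSourceManagerSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Called by a ReplicationSource after a batch of edits is acked by the peer.
    void logPositionAndReport(String peerId, long hlogKeyWriteTime, long nowMs) {
        lastConfirmed.put(peerId, hlogKeyWriteTime);
        lastReport.put(peerId, nowMs);
        // ... here: match writeTime to an HLog file and publish in ZK ...
    }

    // A peer is considered unreachable if no report arrived within the timeout.
    boolean isPeerUnreachable(String peerId, long nowMs) {
        Long last = lastReport.get(peerId);
        return last == null || nowMs - last > timeoutMs;
    }

    // Last confirmed HLogKey writeTime for a peer, or null if never reported.
    Long lastConfirmedWriteTime(String peerId) {
        return lastConfirmed.get(peerId);
    }
}
```

The writeTime-based position is what lets a new ReplicationSource resume from ZK after a region server dies, since it identifies both the HLog file and the offset within the edit stream.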
What I'm thinking is that the master, when done splitting logs, should hand that pile to a single RS, which will open a new ReplicationSource and hopefully complete the replication. We can use the information published in ZK to learn the state of each replication stream per peer and show that in a UI.

> Handle 10min+ network partitions between clusters
> -------------------------------------------------
>
>                 Key: HBASE-2223
>                 URL: https://issues.apache.org/jira/browse/HBASE-2223
>             Project: Hadoop HBase
>          Issue Type: Sub-task
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.21.0
>
> We need a nice way of handling long network partitions without impacting a
> master cluster (which pushes the data). Currently it will just retry over and
> over again.
> I think we could:
> - Stop replication to a slave cluster if it didn't respond for more than 10
> minutes
> - Keep track of the duration of the partition
> - When the slave cluster comes back, initiate a MR job like HBASE-2221
> Maybe we want less than 10 minutes, maybe we want this to be all automatic or
> just the first 2 parts. Discuss.