Hi everyone,

I'm writing to propose merging HBASE-26913[1], "Replication Observability Framework", to master and branch-2. The goal is to persist replication-related metadata to newly created HBase tables in order to reduce the MTTD (Mean Time To Detect) and MTTR (Mean Time To Repair) for replication inconsistencies between Primary and DR clusters. The design doc is here[2].

We have created 2 new HBase tables, REPLICATION.WALEVENTTRACKER and REPLICATION.SINK_TRACKER. The first table stores all the WAL events (ACTIVE, ROLLING, ROLLED) along with their metadata (WAL name, WAL length, region server name, timestamp) from all region servers. We have also introduced a new chore called ReplicationMarkerChore, which periodically creates special marker/sentinel rows (at a configurable interval) and injects them directly into the WAL. These marker rows are handled specially: they are replicated to the sink cluster and persisted to the REPLICATION.SINK_TRACKER table.

Highlights
* The entire feature is configurable and is disabled by default.
* A new section has been added to the HBase book which covers the feature and how to use it.

The vote will be open for at least 72 hours.

Please vote:
[+1] Merge the changes from HBASE-26913 to master/branch-2
[+/-0] Neutral
[-1] Disagree (please include actionable feedback)

1. https://issues.apache.org/jira/browse/HBASE-26913
2. https://docs.google.com/document/d/14oZ5ssY28hvJaQD_Jg9kWX7LfUKUyyU2PCA93PPzVko/edit#heading=h.9oum2kn0zj5r

Thanks,
Rushabh
