Hello HBase Community,

We recently upgraded our HBase cluster from version 1.2.6 to 1.4.14 and have encountered an issue with replication lag in our Disaster Recovery (DR) cluster. We have two clusters in our setup: an active write cluster and a DR cluster that receives replication from the active cluster. The replication lag in the DR cluster has been building up, even though there are no direct writes to it.

Here's a brief overview of the problem:

- We have an active write cluster with no replication lag.
- The DR cluster only receives replication from the active cluster and doesn't have direct writes.
- Replication lag builds up in the DR cluster over time, even though there are no active writes to it.
- When a 'put' call is made in the DR cluster, the replication lag drops momentarily, but then starts building up again.

We experienced a similar issue on version 1.4.9 in another cluster, and we applied the patch below for it:

https://issues.apache.org/jira/browse/HBASE-22784

Version 1.4.14 already contains this patch, but we still see the issue.

Are there any specific configurations or adjustments we should be making to address this problem? It's important for us to maintain a reliable DR setup, and any guidance or insights you can provide would be greatly appreciated.

If anyone has experienced a similar issue after upgrading HBase, or has any recommendations on how to troubleshoot and resolve replication lag in a DR cluster, please share your thoughts.

Thank you in advance for your time and assistance. Your expertise and insights are invaluable to us as we work to resolve this issue and maintain the stability of our HBase setup.

Best regards,
Manimekalai K