[GitHub] [hbase] wchevreuil commented on a diff in pull request #5241: HBASE-27871 Meta replication stuck forever if wal it's still reading gets rolled and deleted

via GitHub Tue, 23 May 2023 08:38:13 -0700


wchevreuil commented on code in PR #5241:
URL: https://github.com/apache/hbase/pull/5241#discussion_r1202563088



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestMetaRegionReplicaReplicationEndpoint.java:
##########
@@ -225,6 +227,38 @@ public void testCatalogReplicaReplicationWithFlushAndCompaction() throws Excepti
     }
   }
 
+  @Test
+  public void testCatalogReplicaReplicationWALRolledAndDeleted() throws Exception {
+    Connection connection = ConnectionFactory.createConnection(HTU.getConfiguration());
+    TableName tableName = TableName.valueOf("hbase:meta");
+    Table table = connection.getTable(tableName);
+    try {
+      MiniHBaseCluster cluster = HTU.getHBaseCluster();
+      HRegionServer hrs = cluster.getRegionServer(cluster.getServerHoldingMeta());
+      ReplicationSource source = (ReplicationSource) hrs.getReplicationSourceService()
+        .getReplicationManager().catalogReplicationSource.get();
+      ((ReplicationPeerImpl) source.replicationPeer).setPeerState(false);
+      // load the data to the table
+      for (int i = 0; i < 5; i++) {
+        LOG.info("Writing data from " + i * 1000 + " to " + (i * 1000 + 1000));
+        HTU.loadNumericRows(table, HConstants.CATALOG_FAMILY, i * 1000, i * 1000 + 1000);
+        LOG.info("flushing table");
+        HTU.flush(tableName);
+        LOG.info("compacting table");
+        if (i < 4) {
+          HTU.compact(tableName, false);
+        }
+      }
+      HTU.getHBaseCluster().getMaster().getLogCleaner().triggerCleanerNow().get(1,
+        TimeUnit.SECONDS);
+      ((ReplicationPeerImpl) source.replicationPeer).setPeerState(true);
+      verifyReplication(tableName, numOfMetaReplica, 0, 5000, HConstants.CATALOG_FAMILY);

Review Comment:
   `Here we just checked whether data can be read? I think the main thing here 
is that we should make sure the ReplicationSource can still replicate things 
out, i.e., it is not stuck forever.`
   
   That is what we are testing here. We disable the catalog peer before we do 
the flushes and compactions. If replication gets stuck forever because of the 
FNFE (FileNotFoundException), the updates from line #244 (the loadNumericRows 
call) never reach the secondary replicas, and the verifyReplication call on 
line #255 fails.
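   
   To make the argument concrete, here is a toy model in plain Java. It is not 
HBase code and not the fix in this PR: the class, the method, and the 
`skipMissing` flag are all hypothetical, and the retry loop is bounded by 
`maxAttempts` only so the sketch terminates. It shows why a check on the 
replica's contents is enough to detect a source that is stuck on a deleted WAL.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are HBase APIs.
final class WalReplicationSketch {

  /**
   * Drains the queued WAL names into a "replica" list.
   *
   * @param queue       WAL names the source still plans to read, oldest first
   * @param wals        WAL name -> edits, for files that still exist
   * @param skipMissing hypothetical switch: move past a missing file instead of retrying it
   * @param maxAttempts bound on loop iterations, standing in for "retries forever"
   */
  static List<Integer> drain(Deque<String> queue, Map<String, List<Integer>> wals,
      boolean skipMissing, int maxAttempts) {
    List<Integer> replica = new ArrayList<>();
    int attempts = 0;
    while (!queue.isEmpty() && attempts++ < maxAttempts) {
      String wal = queue.peek();
      List<Integer> edits = wals.get(wal);
      if (edits == null) {
        // The file was rolled and deleted: this models the FNFE case.
        if (skipMissing) {
          queue.poll(); // give up on the missing file and move on
        }
        continue; // without skipMissing, the same file is retried
      }
      replica.addAll(edits);
      queue.poll();
    }
    return replica;
  }
}
```

   With a queue of `wal-1, wal-2` where only `wal-2` still exists, the 
non-skipping call returns an empty replica no matter how large `maxAttempts` 
is, so any assertion on the replica's contents fails. That is the same signal 
verifyReplication gives in the test.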



