[
https://issues.apache.org/jira/browse/HDFS-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17677866#comment-17677866
]
ASF GitHub Bot commented on HDFS-16890:
---------------------------------------
goiri commented on code in PR #5298:
URL: https://github.com/apache/hadoop/pull/5298#discussion_r1072549013
##########
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestObserverWithRouter.java:
##########
@@ -639,4 +643,36 @@ public void testRouterStateIdContextCleanup() throws Exception {
assertEquals("ns0", namespace1.get(0));
assertTrue(namespace2.isEmpty());
}
+
+ @Test
+ @Tag(SKIP_BEFORE_EACH_CLUSTER_STARTUP)
+ public void testPeriodicStateRefreshUsingActiveNamenode() throws Exception {
+ Path rootPath = new Path("/");
+
+ Configuration confOverride = new Configuration(false);
+ confOverride.set(RBFConfigKeys.DFS_ROUTER_OBSERVER_STATE_ID_REFRESH_PERIOD_KEY, "500ms");
+ confOverride.set(DFSConfigKeys.DFS_HA_TAILEDITS_PERIOD_KEY, "3s");
+ startUpCluster(1, confOverride);
+
+ fileSystem = routerContext.getFileSystem(getConfToEnableObserverReads());
+ fileSystem.listStatus(rootPath);
+ int initialLengthOfRootListing = fileSystem.listStatus(rootPath).length;
+
+ DFSClient activeClient = cluster.getNamenodes("ns0")
+ .stream()
+ .filter(nnContext -> nnContext.getNamenode().isActiveState())
+ .findFirst().orElseThrow(() -> new IllegalStateException("No active namenode."))
+ .getClient();
+
+ for (int i = 0; i < 10; i++) {
+ activeClient.mkdirs("/dir" + i, null, false);
+ }
+ activeClient.close();
+
+ // Wait long enough for state in router to be considered stale.
+ Thread.sleep(700);
Review Comment:
Use GenericTestUtils#waitFor here instead of a fixed Thread.sleep.
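
For readers unfamiliar with the pattern behind this suggestion: a waitFor-style helper polls a condition at a short interval and gives up after a timeout, so the test proceeds as soon as the condition holds rather than always sleeping for a fixed time. The following is a minimal, self-contained sketch of that idea. The class PollingWait is hypothetical and is not Hadoop code; it only mirrors the general shape (check, polling interval, timeout) of GenericTestUtils#waitFor.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical stand-in for a waitFor-style test helper; not Hadoop code. */
final class PollingWait {
  private PollingWait() {}

  /**
   * Re-evaluates the check every checkEveryMillis until it returns true.
   * Throws TimeoutException if the check is still false once
   * waitForMillis has elapsed.
   */
  static void waitFor(Supplier<Boolean> check, long checkEveryMillis, long waitForMillis)
      throws TimeoutException, InterruptedException {
    long deadline = System.nanoTime() + waitForMillis * 1_000_000L;
    while (true) {
      if (Boolean.TRUE.equals(check.get())) {
        return; // condition met, stop waiting immediately
      }
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("Condition not met within " + waitForMillis + " ms");
      }
      Thread.sleep(checkEveryMillis);
    }
  }
}
```

In the test above, the check would be whatever observable effect the 700 ms sleep is waiting for; which condition fits best is left to the PR author.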
> RBF: Add period state refresh to keep router state near active namenode's
> -------------------------------------------------------------------------
>
> Key: HDFS-16890
> URL: https://issues.apache.org/jira/browse/HDFS-16890
> Project: Hadoop HDFS
> Issue Type: Task
> Reporter: Simbarashe Dzinamarira
> Assignee: Simbarashe Dzinamarira
> Priority: Major
> Labels: pull-request-available
>
> When using the ObserverReadProxyProvider, clients can set
> *dfs.client.failover.observer.auto-msync-period...* to periodically get the
> Active namenode's state. When using routers without the
> ObserverReadProxyProvider, this periodic update is lost.
> In a busy cluster, the Router constantly gets updated with the active
> namenode's state when
> # There is a write operation.
> # There is an operation (read/write) from a new client.
> However, in the scenario when there are no new clients and no write
> operations, the state kept in the router can lag behind the active's. The
> router does update its state with responses from the Observer, but the
> observer may be lagging behind too.
> We should have a periodic refresh in the router to serve a similar role as
> *dfs.client.failover.observer.auto-msync-period*
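
To illustrate the idea in the description, here is a rough sketch of a periodic refresh that keeps a cached state ID from falling behind the active's when no writes or new clients arrive. Everything in it is an assumption made for illustration: the class PeriodicStateRefresher, the LongSupplier standing in for "ask the active namenode for its state ID", and the use of a plain long for the state ID are hypothetical and are not taken from the PR or from the router code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical illustration only; not the router implementation from the PR. */
final class PeriodicStateRefresher implements AutoCloseable {
  // Highest state ID seen so far; never moves backwards.
  private final AtomicLong lastSeenStateId = new AtomicLong(Long.MIN_VALUE);
  // Assumed stand-in for a call that fetches the active namenode's state ID.
  private final LongSupplier activeStateId;
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "state-refresher");
        t.setDaemon(true);
        return t;
      });

  PeriodicStateRefresher(LongSupplier activeStateId) {
    this.activeStateId = activeStateId;
  }

  /** Schedules refreshOnce to run repeatedly with the given delay between runs. */
  void start(long periodMillis) {
    // Note: if refreshOnce throws, the executor suppresses later runs.
    scheduler.scheduleWithFixedDelay(this::refreshOnce, 0, periodMillis, TimeUnit.MILLISECONDS);
  }

  /** Fetches the active's state ID and keeps the larger of old and new values. */
  void refreshOnce() {
    long fetched = activeStateId.getAsLong();
    lastSeenStateId.accumulateAndGet(fetched, Math::max);
  }

  long getLastSeenStateId() {
    return lastSeenStateId.get();
  }

  @Override
  public void close() {
    scheduler.shutdownNow();
  }
}
```

Taking the maximum means a stale or out-of-order response cannot move the cached state backwards, which matches the intent of keeping the router's state near the active's.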