[
https://issues.apache.org/jira/browse/HDFS-17166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759196#comment-17759196
]
ASF GitHub Bot commented on HDFS-17166:
---------------------------------------
KeeProMise commented on code in PR #5990:
URL: https://github.com/apache/hadoop/pull/5990#discussion_r1306302466
##########
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/resolver/MembershipNamenodeResolver.java:
##########
@@ -83,14 +83,12 @@ public class MembershipNamenodeResolver
/** Cached lookup of NN for block pool. Invalidated on cache refresh. */
private Map<String, List<? extends FederationNamenodeContext>> cacheBP;
-
Review Comment:
done
##########
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterClientRejectOverload.java:
##########
@@ -357,6 +363,72 @@ public void testNoNamenodesAvailable() throws Exception{
assertEquals(originalRouter0Failures, rpcMetrics0.getProxyOpNoNamenodes());
}
+ /**
+ * When failover occurs, the router may record that the ns has no active
namenode.
+ * Only when the router updates the cache next time can the memory status be
updated,
+ * causing the router to report NoNamenodesAvailableException for a long time
+ */
+ @Test
+ public void testNoNamenodesAvailableLongTimeWhenNsFailover() throws
Exception {
+ setupCluster(false, true);
+ transitionClusterNSToStandby(cluster);
+ for (RouterContext routerContext : cluster.getRouters()) {
+ // Manually trigger the heartbeat
+ Collection<NamenodeHeartbeatService> heartbeatServices = routerContext
+ .getRouter().getNamenodeHeartbeatServices();
+ for (NamenodeHeartbeatService service : heartbeatServices) {
+ service.periodicInvoke();
+ }
+ // Update service cache
+ routerContext.getRouter().getStateStore().refreshCaches(true);
+ }
+ // Record the time after the router first updated the cache
+ long firstLoadTime = Time.now();
+ List<MiniRouterDFSCluster.NamenodeContext> namenodes =
cluster.getNamenodes();
+
+ // Make sure all namenodes are in standby state
+ for (MiniRouterDFSCluster.NamenodeContext namenodeContext : namenodes) {
+ assertTrue(namenodeContext.getNamenode().getNameNodeState() ==
STANDBY.ordinal());
+ }
+
+ Configuration conf = cluster.getRouterClientConf();
+ // Set dfs.client.failover.random.order false, to pick 1st router at first
+ conf.setBoolean("dfs.client.failover.random.order", false);
+
+ DFSClient routerClient = new DFSClient(new URI("hdfs://fed"), conf);
+
+ for (RouterContext routerContext : cluster.getRouters()) {
+ // Get the second namenode in the router cache and make it active
+ List<? extends FederationNamenodeContext> ns0 =
routerContext.getRouter().getNamenodeResolver().getNamenodesForNameserviceId("ns0",
false);
Review Comment:
done
##########
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterClientRejectOverload.java:
##########
@@ -357,6 +363,72 @@ public void testNoNamenodesAvailable() throws Exception{
assertEquals(originalRouter0Failures, rpcMetrics0.getProxyOpNoNamenodes());
}
+ /**
+ * When failover occurs, the router may record that the ns has no active
namenode.
+ * Only when the router updates the cache next time can the memory status be
updated,
+ * causing the router to report NoNamenodesAvailableException for a long time
+ */
+ @Test
+ public void testNoNamenodesAvailableLongTimeWhenNsFailover() throws
Exception {
+ setupCluster(false, true);
+ transitionClusterNSToStandby(cluster);
+ for (RouterContext routerContext : cluster.getRouters()) {
+ // Manually trigger the heartbeat
+ Collection<NamenodeHeartbeatService> heartbeatServices = routerContext
+ .getRouter().getNamenodeHeartbeatServices();
+ for (NamenodeHeartbeatService service : heartbeatServices) {
+ service.periodicInvoke();
+ }
+ // Update service cache
+ routerContext.getRouter().getStateStore().refreshCaches(true);
+ }
+ // Record the time after the router first updated the cache
+ long firstLoadTime = Time.now();
+ List<MiniRouterDFSCluster.NamenodeContext> namenodes =
cluster.getNamenodes();
+
+ // Make sure all namenodes are in standby state
+ for (MiniRouterDFSCluster.NamenodeContext namenodeContext : namenodes) {
+ assertTrue(namenodeContext.getNamenode().getNameNodeState() ==
STANDBY.ordinal());
+ }
+
+ Configuration conf = cluster.getRouterClientConf();
+ // Set dfs.client.failover.random.order false, to pick 1st router at first
+ conf.setBoolean("dfs.client.failover.random.order", false);
+
+ DFSClient routerClient = new DFSClient(new URI("hdfs://fed"), conf);
+
+ for (RouterContext routerContext : cluster.getRouters()) {
+ // Get the second namenode in the router cache and make it active
+ List<? extends FederationNamenodeContext> ns0 =
routerContext.getRouter().getNamenodeResolver().getNamenodesForNameserviceId("ns0",
false);
+ String nsId = ns0.get(1).getNamenodeId();
+ cluster.switchToActive("ns0", nsId);
+ // Manually trigger the heartbeat, but the router does not manually load
the cache
+ Collection<NamenodeHeartbeatService> heartbeatServices = routerContext
+ .getRouter().getNamenodeHeartbeatServices();
+ for (NamenodeHeartbeatService service : heartbeatServices) {
+ service.periodicInvoke();
+ }
+ assertTrue(cluster.getNamenode("ns0",
nsId).getNamenode().getNameNodeState() == ACTIVE.ordinal());
Review Comment:
done
> RBF: Throwing NoNamenodesAvailableException for a long time, when failover
> --------------------------------------------------------------------------
>
> Key: HDFS-17166
> URL: https://issues.apache.org/jira/browse/HDFS-17166
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Jian Zhang
> Priority: Major
> Labels: pull-request-available
> Attachments:
> fix_NoNamenodesAvailableException_long_time_when_ns_failover.patch,
> image-2023-08-26-00-24-02-016.png, image-2023-08-26-00-25-42-086.png
>
>
> When ns failover, the router may record that the ns have no active namenode,
> the router cannot find the active nn in the ns for about 1 minute. The client
> will report an error after consuming the number of retries, and the router
> will be unable to provide services for the ns for a long time.
> 11:52:44 Start reporting
> !image-2023-08-26-00-24-02-016.png|width=800,height=100!
> 11:53:46 end reporting
> !image-2023-08-26-00-25-42-086.png|width=800,height=50!
>
> At this point, the failover has been successfully completed in the ns, and
> the client can directly connect to the active namenode to access it
> successfully, but the client cannot access the ns through router for up to a
> minute
>
> *There is a bug in this logic:*
> * A certain ns starts to fail over,
> * There is a state where there is no active nn in ns, Router reports the
> status (no active nn) to the state store
> * After a period of time, the router pulls the state store data to update the
> cache, and the cache records that the ns has no active nn
> * Failover successfully completed, at which point the ns actually has an
> active nn
> * Assuming it's not time for router to update the cache yet
> * The client sent a request to the router for the ns, and the router
> accessed the first nn of the ns in the router’s cache (no active nn)
> * Unfortunately, the nn is really standby, so the request went wrong and
> entered the exception handling logic. The router found that there is no
> active nn for the ns in the cache and throw NoNamenodesAvailableException
> * The NoNamenodesAvailableException exception is wrapped as a
> RetrieveException, which causes the client to retry. Since each router
> retrieves the true standby nn in the cache (because it is always the first
> one in the cache and has a high priority), a NoNamenodesAvailableException is
> thrown every time until the router updates the cache from the state store
>
> *Fix the bug*
> When an ns in the router's cache does not have an active nn, but in reality,
> the ns has an active nn, and the client requests to throw a
> NoNamenodesAvailableException, it is proven that the requested nn is a real
> standby nn. The priority of this nn should be lowered so that the next
> request will find the real active nn, avoiding constantly requesting the real
> standby nn, which will cause the cache to be updated before the next time,
> The router is unable to provide services for the ns to the client.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]