liuguanghua created HDFS-17285:
----------------------------------
Summary: [RBF] Decrease dfsrouter safe mode check period.
Key: HDFS-17285
URL: https://issues.apache.org/jira/browse/HDFS-17285
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: liuguanghua
When the dfsrouter starts, it enters safe mode, and it takes about 1 minute to leave.
The log is below:
14:35:23,717 INFO org.apache.hadoop.hdfs.server.federation.router.RouterSafemodeService: Leave startup safe mode after 30000 ms
14:35:23,717 INFO org.apache.hadoop.hdfs.server.federation.router.RouterSafemodeService: Enter safe mode after 180000 ms without reaching the State Store
14:35:23,717 INFO org.apache.hadoop.hdfs.server.federation.router.RouterSafemodeService: Entering safe mode
14:35:24,996 INFO org.apache.hadoop.hdfs.server.federation.router.RouterSafemodeService: Delaying safemode exit for 28721 milliseconds...
14:36:25,037 INFO org.apache.hadoop.hdfs.server.federation.router.RouterSafemodeService: Leaving safe mode after 61319 milliseconds
The time to leave safe mode depends on these configs:
DFS_ROUTER_SAFEMODE_EXTENSION 30s
DFS_ROUTER_SAFEMODE_EXPIRATION 3min
DFS_ROUTER_CACHE_TIME_TO_LIVE_MS 1min (this is also used as the period of the safe mode check)
Because the dfsrouter rejects write requests while in safe mode, the safe mode
check period should be shorter once refreshCaches is done. We should also
separate DFS_ROUTER_CACHE_TIME_TO_LIVE_MS from RouterSafemodeService.
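
To illustrate the timing, here is a minimal sketch of a simplified model, not the actual RouterSafemodeService code. It assumes the router can only leave safe mode at a periodic check, and that it leaves at the first check at or after the 30s extension. The class and method names are hypothetical; the first check offset (1279 ms) is taken from the log timestamps above (14:35:24,996 minus 14:35:23,717), and the 5s period is only an example value.

```java
final class SafeModeTiming {

    /**
     * Returns the elapsed time (ms since entering safe mode) of the first
     * periodic check that falls at or after the safe mode extension.
     * Simplified model: checks run at firstCheckMs, then every checkPeriodMs.
     */
    static long leaveTimeMs(long firstCheckMs, long checkPeriodMs, long extensionMs) {
        if (checkPeriodMs <= 0) {
            throw new IllegalArgumentException("checkPeriodMs must be positive");
        }
        long t = firstCheckMs;
        while (t < extensionMs) {
            t += checkPeriodMs; // wait for the next periodic check
        }
        return t;
    }

    public static void main(String[] args) {
        long firstCheckMs = 1_279;  // first check in the log, relative to entering safe mode
        long extensionMs = 30_000;  // DFS_ROUTER_SAFEMODE_EXTENSION

        // Check period tied to DFS_ROUTER_CACHE_TIME_TO_LIVE_MS (1min), as today.
        System.out.println(leaveTimeMs(firstCheckMs, 60_000, extensionMs)); // 61279

        // Hypothetical shorter, dedicated check period (5s).
        System.out.println(leaveTimeMs(firstCheckMs, 5_000, extensionMs));  // 31279
    }
}
```

Under this model the 1min period gives about 61s, close to the 61319 ms in the log, while a 5s period would leave safe mode shortly after the 30s extension.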