[
https://issues.apache.org/jira/browse/HDFS-11740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993302#comment-15993302
]
Anu Engineer commented on HDFS-11740:
-------------------------------------
[~cheersyang] Thanks for the proposal. I am fine with having the heartbeat
handling keyed off a per-state interval. But just to make sure that we are all
on the same page: the original 90 seconds is the time that the datanode would
use to read the various containers and make sure that it is ready to
communicate with both SCM and the world.
Right now we have some missing pieces: we need to launch a background thread
that does a directory scan for containers and volumes, so that we are ready
when SCM asks for container reports.
So I am doubtful that we will gain anything by accelerating the initial boot
time (other than the case when you are testing, where I set the HB to 1 second).
Perhaps if you want to add some more states, a different interval for each
state might be useful. If you want to do it, you can add an interval to the
states class (it is an internal class in the state machine) and control it per
state. Then the main loop can read the next wake-up time from the state itself,
which allows a per-state time setting.
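A rough sketch of that per-state interval idea follows. It is not the actual
DatanodeStateMachine code: the class name, the state names (INIT, REGISTER,
HEARTBEAT), the interval values, and the helper nextWakeupMillis are all
hypothetical, chosen only to show the main loop reading its next wake-up time
from the current state. A simulated clock is used so the example runs without
sleeping.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and values are illustrative assumptions,
// not the real Ozone classes or defaults.
public class PerStateIntervalSketch {

  /** Each state carries its own scheduling interval. */
  public enum State {
    INIT(1, TimeUnit.SECONDS),       // assumed short interval
    REGISTER(1, TimeUnit.SECONDS),   // assumed short interval
    HEARTBEAT(30, TimeUnit.SECONDS); // mirrors the 30s heartbeat default

    private final long interval;
    private final TimeUnit unit;

    State(long interval, TimeUnit unit) {
      this.interval = interval;
      this.unit = unit;
    }

    /** Interval for this state, in milliseconds. */
    public long intervalMillis() {
      return unit.toMillis(interval);
    }

    /** Simplified transition: advance until HEARTBEAT, then stay there. */
    public State next() {
      switch (this) {
        case INIT:
          return REGISTER;
        default:
          return HEARTBEAT;
      }
    }
  }

  /** The main loop asks the current state when it should wake up next. */
  public static long nextWakeupMillis(State state, long nowMillis) {
    return nowMillis + state.intervalMillis();
  }

  public static void main(String[] args) {
    long clock = 0L; // simulated time in milliseconds, no real sleeping
    State state = State.INIT;
    for (int i = 0; i < 4; i++) {
      long wakeup = nextWakeupMillis(state, clock);
      System.out.println(state + " runs at t=" + clock
          + "ms, next wake-up at t=" + wakeup + "ms");
      clock = wakeup;
      state = state.next();
    }
  }
}
```

With intervals attached to the states, the loop itself stays unchanged when a
state is added or its timing is tuned; whether the early states can really be
shortened still depends on the background scan mentioned above.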
I would think it is not a very complex change, and as I said I am OK with that,
but I doubt that we will be able to speed up booting the datanode, since we
need to do some background sanity checks while booting up.
> Ozone: Differentiate time interval for different DatanodeStateMachine state
> tasks
> ---------------------------------------------------------------------------------
>
> Key: HDFS-11740
> URL: https://issues.apache.org/jira/browse/HDFS-11740
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
>
> Currently the datanode state machine transitions between tasks at a fixed
> time interval, defined by
> {{ScmConfigKeys#OZONE_SCM_HEARTBEAT_INTERVAL_SECONDS}}; the default value is
> 30s. Once a datanode is started, it needs 90s before it transitions to the
> {{Heartbeat}} state; such a long lag is not necessary. I propose improving
> the time interval handling: it seems only the heartbeat task needs to be
> scheduled at the {{OZONE_SCM_HEARTBEAT_INTERVAL_SECONDS}} interval, and the
> rest should run without any lag.