[
https://issues.apache.org/jira/browse/HDFS-11740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996510#comment-15996510
]
Weiwei Yang edited comment on HDFS-11740 at 5/4/17 3:05 PM:
------------------------------------------------------------
Hi [~anu]
Thanks for your thoughtful comment, I appreciate it. Please see my answers below.
Fixed Heartbeat - Pros:
bq. Simple to understand and write code. We are able to write good error
messages like this...
This doesn't change. I tested it on my cluster; it still shows the same message
as before.
bq. Fewer knobs to adjust – Since init, version and register are three states –
we are optimizing the first 90 seconds of a datanodes life. Since datanodes are
very long running processes, does this optimization matter?
I think it matters. There will be more states in the future; if every state
transition sleeps for a fixed interval (which is currently the interval for
node heartbeats to SCM), it might slow down the actual work. For example, if in
the future we want to support decommissioning a datanode from SCM, then once
decommissioning is done the state should transition to decommissioned.
Decommissioning may take some time and the client is waiting on it; the client
probably won't be happy if it has to wait another 30s before the state changes.
Right now is a good time to make this change because there aren't many states
yet, so it is easy to do.
bq. If that retry is happening, let us say one SCM is dead or network issue –
we don't want the scheduler to be running the next task immediately. We want
some quite period since this is an admin task – and we should not be consuming
too much resources. I am worried that RPC retry will happen till we time out
and then due to this
This is true. If a task fails, I can set the interval to something longer and
ask the scheduler to schedule the next task after that delay. This can be done
within the current patch; I will show it in the v3 patch.
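A minimal sketch of that failure back-off, assuming a hypothetical {{EndpointTaskScheduler}} helper (the class name, method name, and delay values here are illustrative, not from the actual patch):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: after a failed end point task (e.g. SCM unreachable),
// pick a longer delay so the scheduler backs off instead of retrying at once.
public class EndpointTaskScheduler {

  static final long NORMAL_DELAY_SECONDS = 0;   // next task runs immediately
  static final long FAILURE_DELAY_SECONDS = 30; // quiet period after a failure

  // Decide how long the scheduler should wait before the next state task.
  static long nextDelaySeconds(boolean lastTaskFailed) {
    return lastTaskFailed ? FAILURE_DELAY_SECONDS : NORMAL_DELAY_SECONDS;
  }

  public static void main(String[] args) throws InterruptedException {
    ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor();
    // In the real loop the delay would be nextDelaySeconds(...) in seconds;
    // milliseconds are used here only so the demo finishes quickly.
    executor.schedule(
        () -> System.out.println("running next state task"),
        10, TimeUnit.MILLISECONDS);
    executor.shutdown();
    executor.awaitTermination(1, TimeUnit.SECONDS);
  }
}
```

The point is that the quiet period after a failure is decided per task run, rather than the RPC retry deciding when the next task fires.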
bq. if you want to support this feature – may I suggest that we make changes in
DatanodeStates...
I tried this approach, but it did not work out well for me. The interval
setting is better kept at the {{end point task}} level, because different tasks
may require different intervals to run. Using {{ScheduledExecutorService}} as
the executor service lets the state machine schedule tasks at the required
interval when necessary, which is much more convenient than {{sleep}}. The
behavior changes as follows.
Before the patch:
# Load the state task according to the current datanode state
# Execute the state task
# Wait until the task returns; the result indicates the desired next state
# Transition to the next state if necessary
# Sleep for a fixed interval
# Go back to step 1 for the next loop
After the patch:
# Load the state task according to the current datanode state
# Schedule the task to execute either immediately or some time later, according to the task's interval
# Wait until the task returns; the result indicates the desired next state
# Transition to the next state if necessary
# Go back to step 1 for the next loop
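The loop above could be sketched roughly like this with {{ScheduledExecutorService}} (a simplified illustration only; the state names match the description, but the class and method names are made up for this sketch, and the real state machine is more involved):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of the "after patch" loop: each state task carries its
// own interval, and a ScheduledExecutorService runs it after that delay
// instead of the loop sleeping a fixed heartbeat interval between all states.
public class StateLoopSketch {

  enum State { INIT, VERSION, REGISTER, HEARTBEAT }

  // Per-task interval: only heartbeat waits; init/version/register run at once.
  static long taskIntervalMillis(State state) {
    return state == State.HEARTBEAT ? 30_000 : 0;
  }

  // The next state each task transitions to (simplified, linear).
  static State nextState(State state) {
    switch (state) {
      case INIT:    return State.VERSION;
      case VERSION: return State.REGISTER;
      default:      return State.HEARTBEAT; // heartbeat repeats
    }
  }

  public static void main(String[] args) throws Exception {
    ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor();
    State state = State.INIT;
    // Run the first three transitions; with per-task intervals they all
    // execute immediately instead of waiting 30s between each state.
    for (int i = 0; i < 3; i++) {
      final State current = state;
      state = executor.schedule(
          (Callable<State>) () -> {
            System.out.println("executing task for state " + current);
            return nextState(current);
          },
          taskIntervalMillis(current), TimeUnit.MILLISECONDS).get();
    }
    System.out.println("reached state " + state);
    executor.shutdown();
  }
}
```

With intervals like these, a freshly started datanode would reach the heartbeat state immediately rather than after three fixed 30s waits.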
Please let me know your thoughts.
Thanks
> Ozone: Differentiate time interval for different DatanodeStateMachine state
> tasks
> ---------------------------------------------------------------------------------
>
> Key: HDFS-11740
> URL: https://issues.apache.org/jira/browse/HDFS-11740
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Attachments: HDFS-11740-HDFS-7240.001.patch,
> HDFS-11740-HDFS-7240.002.patch, statemachine_1.png, statemachine_2.png
>
>
> Currently the datanode state machine transitions between tasks at a fixed
> time interval, defined by
> {{ScmConfigKeys#OZONE_SCM_HEARTBEAT_INTERVAL_SECONDS}}, with a default value
> of 30s. Once a datanode is started, it needs 90s before transitioning to the
> {{Heartbeat}} state; such a long lag is not necessary. Propose to improve the
> time interval handling: it seems only the heartbeat task needs to be
> scheduled at the {{OZONE_SCM_HEARTBEAT_INTERVAL_SECONDS}} interval, and the
> rest should be done without any lag.