[
https://issues.apache.org/jira/browse/HDFS-11740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996510#comment-15996510
]
Weiwei Yang edited comment on HDFS-11740 at 5/4/17 3:05 PM:
------------------------------------------------------------
Hi [~anu]
Thanks for your thoughtful comment, I appreciate it. Please see my answers below.
Fixed Heartbeat - Pros:
bq. Simple to understand and write code. We are able to write good error
messages like this...
This doesn't change. I tested it on my cluster; it still shows the same message
as before.
bq. Fewer knobs to adjust – Since init, version and register are three states –
we are optimizing the first 90 seconds of a datanodes life. Since datanodes are
very long running processes, does this optimization matter?
I think it matters. There will be more states in the future; if every state
transition sleeps for a fixed interval (which is currently the interval for
node heartbeats to SCM), it might slow down the actual work. For example, if in
the future we want to support decommissioning a datanode from SCM, then once
decommissioning is done the state should transition to decommissioned.
Decommissioning may take some time and the client is waiting on it; the client
probably won't be happy if it has to wait another 30s before the state changes.
Right now is a good time to make this change because there aren't many states
yet, so it is easy to do.
bq. If that retry is happening, let us say one SCM is dead or network issue –
we don't want the scheduler to be running the next task immediately. We want
some quite period since this is an admin task – and we should not be consuming
too much resources. I am worried that RPC retry will happen till we time out
and then due to this
This is true. If a task fails, I can set the interval to something longer and
ask the scheduler to schedule the next task after that delay. This can be done
within the current patch; I will show it in the v3 patch.
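A minimal sketch of that failure back-off, assuming a hypothetical {{EndpointTaskScheduler}} helper (the class name, method name, and delay values here are illustrative, not from the actual patch):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: after a failed end point task (e.g. SCM unreachable),
// pick a longer delay so the scheduler backs off instead of retrying at once.
public class EndpointTaskScheduler {

  static final long NORMAL_DELAY_SECONDS = 0;   // next task runs immediately
  static final long FAILURE_DELAY_SECONDS = 30; // quiet period after a failure

  // Decide how long the scheduler should wait before the next state task.
  static long nextDelaySeconds(boolean lastTaskFailed) {
    return lastTaskFailed ? FAILURE_DELAY_SECONDS : NORMAL_DELAY_SECONDS;
  }

  public static void main(String[] args) throws InterruptedException {
    ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor();
    // In the real loop the delay would be nextDelaySeconds(...) in seconds;
    // milliseconds are used here only so the demo finishes quickly.
    executor.schedule(
        () -> System.out.println("running next state task"),
        10, TimeUnit.MILLISECONDS);
    executor.shutdown();
    executor.awaitTermination(1, TimeUnit.SECONDS);
  }
}
```

The point is that the quiet period after a failure is decided per task run, rather than the RPC retry deciding when the next task fires.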
bq. if you want to support this feature – may I suggest that we make changes in
DatanodeStates...
I tried this approach, but it did not work out well for me. The interval
setting is better kept at the {{end point task}} level, because different tasks
may require different intervals to run. Using {{ScheduledExecutorService}} as
the executor service lets the state machine schedule tasks at the required
interval when necessary, which is much more convenient than {{sleep}}. The
behavior changes as follows.
Before the patch:
# Load the state task according to the current datanode state
# Execute the state task
# Wait until the task returns; the result indicates the desired next state
# Transition to the next state if necessary
# Sleep for a fixed interval
# Go back to step 1 for the next loop
After the patch:
# Load the state task according to the current datanode state
# Schedule the task to execute either immediately or some time later, according to the task's interval
# Wait until the task returns; the result indicates the desired next state
# Transition to the next state if necessary
# Go back to step 1 for the next loop
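The loop above could be sketched roughly like this with {{ScheduledExecutorService}} (a simplified illustration only; the state names match the description, but the class and method names are made up for this sketch, and the real state machine is more involved):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of the "after patch" loop: each state task carries its
// own interval, and a ScheduledExecutorService runs it after that delay
// instead of the loop sleeping a fixed heartbeat interval between all states.
public class StateLoopSketch {

  enum State { INIT, VERSION, REGISTER, HEARTBEAT }

  // Per-task interval: only heartbeat waits; init/version/register run at once.
  static long taskIntervalMillis(State state) {
    return state == State.HEARTBEAT ? 30_000 : 0;
  }

  // The next state each task transitions to (simplified, linear).
  static State nextState(State state) {
    switch (state) {
      case INIT:    return State.VERSION;
      case VERSION: return State.REGISTER;
      default:      return State.HEARTBEAT; // heartbeat repeats
    }
  }

  public static void main(String[] args) throws Exception {
    ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor();
    State state = State.INIT;
    // Run the first three transitions; with per-task intervals they all
    // execute immediately instead of waiting 30s between each state.
    for (int i = 0; i < 3; i++) {
      final State current = state;
      state = executor.schedule(
          (Callable<State>) () -> {
            System.out.println("executing task for state " + current);
            return nextState(current);
          },
          taskIntervalMillis(current), TimeUnit.MILLISECONDS).get();
    }
    System.out.println("reached state " + state);
    executor.shutdown();
  }
}
```

With intervals like these, a freshly started datanode would reach the heartbeat state immediately rather than after three fixed 30s waits.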
Please let me know your thoughts.
Thanks
> Ozone: Differentiate time interval for different DatanodeStateMachine state
> tasks
> ---------------------------------------------------------------------------------
>
> Key: HDFS-11740
> URL: https://issues.apache.org/jira/browse/HDFS-11740
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Attachments: HDFS-11740-HDFS-7240.001.patch,
> HDFS-11740-HDFS-7240.002.patch, statemachine_1.png, statemachine_2.png
>
>
> Currently the datanode state machine transitions between tasks at a fixed
> time interval, defined by
> {{ScmConfigKeys#OZONE_SCM_HEARTBEAT_INTERVAL_SECONDS}}, with a default value
> of 30s. Once a datanode is started, it needs 90s before transitioning to the
> {{Heartbeat}} state; such a long lag is not necessary. Propose to improve the
> time interval handling: it seems only the heartbeat task needs to be
> scheduled at the {{OZONE_SCM_HEARTBEAT_INTERVAL_SECONDS}} interval, and the
> rest should be done without any lag.