[jira] [Comment Edited] (HDDS-5331) Recon: Trigger MissingContainerTask when DN becomes stale

Devesh Kumar Singh (Jira) Thu, 24 Nov 2022 00:36:05 -0800


    [ 
https://issues.apache.org/jira/browse/HDDS-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638150#comment-17638150
 ]


Devesh Kumar Singh edited comment on HDDS-5331 at 11/24/22 8:35 AM:
--------------------------------------------------------------------

[~smeng] - To handle this, we need a reference to the Recon task list inside 
StaleNodeHandler and DeadNodeHandler so that we can trigger Recon tasks from 
there. It also requires changing how the Recon tasks (ContainerHealthTask and 
PipelineSyncTask) are started today, to some scheduled mechanism with a fixed 
delay. Currently these tasks are triggered once; in a while loop they wake 
themselves up after a wait and run again after a fixed interval, so there is 
no way to trigger them from outside on demand.

So my main concern here is: is it OK to hold a reference to the Recon task 
list inside StaleNodeHandler and DeadNodeHandler?
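
As a rough sketch of the scheduled fixed-delay mechanism mentioned above: the class and method names below (TriggerableTask, start, triggerNow) are hypothetical and are not existing Recon classes. The idea is that a single-threaded scheduler owns the periodic runs, and an outside caller such as a node handler can queue one extra run on the same thread, so an on-demand run never overlaps a periodic one.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical wrapper (not an existing Recon class): runs a task with a
 * fixed delay and also lets callers request an extra run on demand.
 */
class TriggerableTask implements AutoCloseable {
  // One thread only, so periodic and on-demand runs are serialized.
  private final ScheduledExecutorService executor =
      Executors.newSingleThreadScheduledExecutor();
  private final Runnable work;

  TriggerableTask(Runnable work) {
    this.work = work;
  }

  /** Periodic runs: each run starts delayMillis after the previous one ends. */
  void start(long initialDelayMillis, long delayMillis) {
    executor.scheduleWithFixedDelay(
        this::runSafely, initialDelayMillis, delayMillis, TimeUnit.MILLISECONDS);
  }

  /** On-demand run, queued behind whatever is currently executing. */
  CompletableFuture<Void> triggerNow() {
    return CompletableFuture.runAsync(this::runSafely, executor);
  }

  private void runSafely() {
    try {
      work.run();
    } catch (RuntimeException e) {
      // An uncaught exception would cancel the fixed-delay schedule.
      System.err.println("Task run failed: " + e);
    }
  }

  @Override
  public void close() {
    executor.shutdownNow();
  }
}
```

With something like this, a handler would only need a reference to the task (or to a small registry of tasks) and would call triggerNow() when a node event arrives.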

 

Also, StaleNodeHandler doesn't update the container state or remove container 
replicas from the Recon cache, which ContainerHealthTask depends on to identify 
missing or unhealthy containers, so I don't see any use in triggering 
ContainerHealthTask from StaleNodeHandler. We can, however, trigger it from 
DeadNodeHandler.

 

We should trigger PipelineSyncTask from StaleNodeHandler because we do close 
pipelines there, so PipelineSyncTask will make sure the pipeline-related 
information is updated.
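
The split described above could look roughly like the following. This is only an illustration: OnDemandTask and NodeEventTaskTrigger are hypothetical names, and the real StaleNodeHandler and DeadNodeHandler have their own signatures that are not reproduced here. Whether the dead-node path should also trigger PipelineSyncTask is not settled in this comment, so the sketch leaves it out.

```java
/** Hypothetical stand-in for a Recon task that can be run on demand. */
interface OnDemandTask {
  void triggerNow();
}

/**
 * Hypothetical dispatcher showing which task each node event would trigger.
 * It is not the actual StaleNodeHandler or DeadNodeHandler code.
 */
class NodeEventTaskTrigger {
  private final OnDemandTask containerHealthTask;
  private final OnDemandTask pipelineSyncTask;

  NodeEventTaskTrigger(OnDemandTask containerHealthTask,
                       OnDemandTask pipelineSyncTask) {
    this.containerHealthTask = containerHealthTask;
    this.pipelineSyncTask = pipelineSyncTask;
  }

  /** Stale node: pipelines get closed, so only the pipeline view is refreshed. */
  void onStaleNode(String datanodeId) {
    pipelineSyncTask.triggerNow();
  }

  /** Dead node: the comment suggests ContainerHealthTask is worth running here. */
  void onDeadNode(String datanodeId) {
    containerHealthTask.triggerNow();
  }
}
```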

cc: [~kkasawa] 


was (Author: JIRAUSER295411):
[~smeng] - To handle this, we need a reference to the Recon task list inside 
StaleNodeHandler and DeadNodeHandler so that we can trigger Recon tasks from 
there. It also requires changing how the Recon tasks (ContainerHealthTask and 
PipelineSyncTask) are started today, to some scheduled mechanism with a fixed 
delay. Currently these tasks are triggered once; in a while loop they wake 
themselves up after a wait and run again after a fixed interval, so there is 
no way to trigger them from outside on demand.

So my main concern here is: is it OK to hold a reference to the Recon task 
list inside StaleNodeHandler and DeadNodeHandler?

cc: [~kkasawa] 

> Recon: Trigger MissingContainerTask when DN becomes stale
> ---------------------------------------------------------
>
>                 Key: HDDS-5331
>                 URL: https://issues.apache.org/jira/browse/HDDS-5331
>             Project: Apache Ozone
>          Issue Type: Task
>          Components: Ozone Recon
>            Reporter: Siyao Meng
>            Assignee: Devesh Kumar Singh
>            Priority: Major
>
> Currently, the ContainerHealthTask runs at periodic intervals to check whether 
> a container is under-replicated or missing. A small improvement, triggering a 
> run of this task when a DN goes STALE or DEAD, can help identify 
> under-replicated containers sooner.
> CC [~avijayan]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

