[
https://issues.apache.org/jira/browse/HDDS-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640994#comment-17640994
]
Siyao Meng commented on HDDS-5331:
----------------------------------
[~deveshsingh]
bq. we need to have a reference to the Recon task list inside StaleNodeHandler
and DeadNodeHandler
You mean StaleNodeHandler and *Recon*DeadNodeHandler, right? They are
[initialized in
Facade|https://github.com/apache/ozone/blob/b46f961b59a994ccab89a7806d815e53e5def8c0/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/scm/ReconStorageContainerManagerFacade.java#L200-L203].
ReconDeadNodeHandler overrides onMessage to do something else. Also, Recon would
not be able to close the actual pipeline with StaleNodeHandler (that is the real
SCM's job). I believe Recon implements its own ReconPipelineManager just for
its internal bookkeeping.
bq. So my main concern here is whether it is OK to have a reference to the Recon
task list inside StaleNodeHandler and DeadNodeHandler?
Sure, we can inject the task list into both handlers. Technically it is doable.
bq. Also, StaleNodeHandler doesn't update container state or remove container
replicas from the Recon cache, which ContainerHealthTask depends on for
identifying missing or unhealthy containers, so I don't see any use in
triggering ContainerHealthTask from StaleNodeHandler.
Right. StaleNodeHandler (in SCM, not Recon) only closes the pipelines. We could
trigger {{PipelineSyncTask}}. If you want to do both in the same JIRA, you
could change the title accordingly.
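To make the "inject the task list" idea concrete, here is a minimal sketch of
what I have in mind. It is not the actual Recon code: {{TriggerableTask}},
{{CountingTask}} and {{TaskTriggeringNodeHandler}} are hypothetical names made
up for illustration, and the datanode is reduced to a plain String ID. The real
handlers and tasks have different signatures.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a Recon task that can be run on demand. */
interface TriggerableTask {
  String name();
  void trigger();
}

/** Hypothetical task that only counts how often it was triggered. */
final class CountingTask implements TriggerableTask {
  private final String name;
  private int runs;

  CountingTask(String name) {
    this.name = name;
  }

  @Override
  public String name() {
    return name;
  }

  @Override
  public void trigger() {
    runs++;
  }

  int runs() {
    return runs;
  }
}

/** Hypothetical handler that receives the task list through its constructor. */
final class TaskTriggeringNodeHandler {
  private final List<TriggerableTask> tasks;
  private final List<String> handledNodes = new ArrayList<>();

  TaskTriggeringNodeHandler(List<TriggerableTask> tasks) {
    // Defensive copy so later changes to the caller's list do not leak in.
    this.tasks = new ArrayList<>(tasks);
  }

  /** Does the handler's own bookkeeping first, then triggers every injected task. */
  void onMessage(String datanodeId) {
    handledNodes.add(datanodeId);
    for (TriggerableTask task : tasks) {
      task.trigger();
    }
  }

  List<String> handledNodes() {
    return handledNodes;
  }
}
```

With this shape, the dead-node handler could be given the container health
task and the stale-node handler only the pipeline sync task, so each event
triggers just the tasks that can make use of it.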
> Recon: Trigger MissingContainerTask when DN becomes stale
> ---------------------------------------------------------
>
> Key: HDDS-5331
> URL: https://issues.apache.org/jira/browse/HDDS-5331
> Project: Apache Ozone
> Issue Type: Task
> Components: Ozone Recon
> Reporter: Siyao Meng
> Assignee: Devesh Kumar Singh
> Priority: Major
> Labels: pull-request-available
>
> Currently, the ContainerHealthTask runs at periodic intervals to check whether
> a container is under-replicated or missing. As a small improvement, triggering
> a run of this task when a DN goes STALE or DEAD can help identify
> under-replicated containers more quickly.
> CC [~avijayan]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)