slfan1989 commented on code in PR #7090:
URL: https://github.com/apache/ozone/pull/7090#discussion_r1724505732
##########
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/StateContext.java:
##########
@@ -112,7 +119,7 @@ public class StateContext {
private final Map<InetSocketAddress, List<Message>>
incrementalReportsQueue;
private final Map<InetSocketAddress, Queue<ContainerAction>>
containerActions;
Review Comment:
Regarding lifeline reporting, I understand that this is a standard operation
in HDFS. However, I have concerns about the current implementation of this
feature.
For example, if a pipeline causes a DataNode (DN) to become unavailable,
meaning the DN cannot provide data services and clients cannot retrieve data
from it, then marking the DN as DEAD is reasonable.
With a lifeline, however, the DN may still appear healthy even though it is
not, which can prevent maintenance personnel from detecting the issue.
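To make the concern concrete, here is a minimal sketch. It is not the actual
SCM node-state code: `NodeTracker`, `NodeState`, and the 10-second dead
interval are hypothetical, and it assumes a lifeline refreshes the same
"last seen" time that a full heartbeat would.

```java
/** Hypothetical sketch of the concern; not Ozone's real SCM logic. */
public class LifelineMaskingSketch {

  enum NodeState { HEALTHY, DEAD }

  /** Tracks the liveness signals received from a single DN. */
  static final class NodeTracker {
    private final long deadIntervalMs;   // assumed threshold, illustrative
    private long lastFullHeartbeatMs;
    private long lastLifelineMs;

    NodeTracker(long deadIntervalMs, long nowMs) {
      this.deadIntervalMs = deadIntervalMs;
      this.lastFullHeartbeatMs = nowMs;
      this.lastLifelineMs = nowMs;
    }

    void onFullHeartbeat(long nowMs) {
      lastFullHeartbeatMs = nowMs;
    }

    void onLifeline(long nowMs) {
      lastLifelineMs = nowMs;
    }

    /** Assumption: a lifeline counts as proof of life, like a heartbeat. */
    NodeState stateWithLifeline(long nowMs) {
      long lastSeen = Math.max(lastFullHeartbeatMs, lastLifelineMs);
      return nowMs - lastSeen > deadIntervalMs
          ? NodeState.DEAD : NodeState.HEALTHY;
    }

    /** Only full heartbeats count as proof of life. */
    NodeState stateHeartbeatOnly(long nowMs) {
      return nowMs - lastFullHeartbeatMs > deadIntervalMs
          ? NodeState.DEAD : NodeState.HEALTHY;
    }
  }

  public static void main(String[] args) {
    NodeTracker dn = new NodeTracker(10_000L, 0L);
    // Full heartbeats stop after t=0, but lifelines keep arriving.
    dn.onLifeline(5_000L);
    dn.onLifeline(10_000L);
    dn.onLifeline(15_000L);
    System.out.println("heartbeat only: " + dn.stateHeartbeatOnly(15_000L));
    System.out.println("with lifeline:  " + dn.stateWithLifeline(15_000L));
  }
}
```

Under these assumptions the two checks disagree once full heartbeats stop:
the heartbeat-only view marks the DN DEAD, while the lifeline view keeps it
HEALTHY for as long as lifelines arrive.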
Lifeline reporting is better suited to scenarios where a heavy operation
temporarily delays the heartbeat, and the heartbeat recovers once that
operation completes. Pipeline and container reports are both lightweight, and
I haven't observed them causing any significant load on the SCM.
cc: @ChenSammi
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]