[
https://issues.apache.org/jira/browse/HDDS-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Rose updated HDDS-8140:
-----------------------------
Description:
The following warning was observed in the logs on startup for 250 containers. The
same warning occurred on 3 of 6 datanodes, for the same set of containers on each node:
{code}
2023-02-16 16:43:05,799 WARN
org.apache.hadoop.ozone.container.common.impl.ContainerSet: Adding container
64079 to missing container set.
{code}
This is a byproduct of HDDS-935, which is an old and involved change. In that
change there is a TODO in
[{{ContainerStateMachine#persistContainerSet}}|https://github.com/apache/ozone/blob/7f22916889b7bf39cdb31e5943cae5768f368198/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L279]
that says there will be a race if block delete does not go through Ratis. When
block deletion was implemented after that change, it did not go through Ratis,
so the race can occur. We need to revisit this area of the code.
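As a rough illustration of how such a warning could arise, the sketch below compares a set of container IDs that was persisted earlier against the containers actually found on disk at startup, and reports the difference as "missing". This is a minimal sketch under assumptions, not the Ozone implementation: the class {{MissingContainerSketch}}, the method {{findMissing}}, and both parameter names are hypothetical. It also assumes that the warning is triggered by an ID that is present in the persisted set but absent on disk, which is one plausible result of a deletion that bypasses Ratis.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for illustration only; this is NOT Ozone's ContainerSet.
public class MissingContainerSketch {

  // persistedIds: container IDs recorded when the container set was persisted (assumed).
  // onDiskIds: container IDs actually found on the datanode at startup (assumed).
  static Set<Long> findMissing(Set<Long> persistedIds, Set<Long> onDiskIds) {
    Set<Long> missing = new TreeSet<>();
    for (Long id : persistedIds) {
      if (!onDiskIds.contains(id)) {
        // Mirrors the wording of the warning quoted in this issue.
        System.out.println("Adding container " + id + " to missing container set.");
        missing.add(id);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // Container 64079 was persisted, but is no longer on disk, for example
    // because it was removed by a path that was not ordered with the persist step.
    Set<Long> persisted = Set.of(64078L, 64079L);
    Set<Long> onDisk = Set.of(64078L);
    System.out.println(findMissing(persisted, onDisk));
  }
}
```

If deletion and persistence of the container set were ordered through the same log, the two sets would be expected to agree at startup. The sketch only shows what the mismatch looks like, not how to prevent it.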
was:
This message was observed in the logs on startup for 250 containers. The same
message occurred on 3/6 datanodes for the same set of containers on each node:
{code}
2023-02-16 16:43:05,799 WARN
org.apache.hadoop.ozone.container.common.impl.ContainerSet: Adding container
64079 to missing container set.
{code}
This is a byproduct of HDDS-935, which is an old and involved change. In that
change there is a TODO in {{ContainerStateMachine#persistContainerSet}} that
says there will be a race if block delete does not go through Ratis. When block
deletion was implemented after that change, it did not go through Ratis so the
race may happen. We need to revisit this area of the code.
> Startup warning about adding containers to missing container set
> ----------------------------------------------------------------
>
> Key: HDDS-8140
> URL: https://issues.apache.org/jira/browse/HDDS-8140
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Ethan Rose
> Priority: Major
>
> The following warning was observed in the logs on startup for 250 containers. The
> same warning occurred on 3 of 6 datanodes, for the same set of containers on each node:
> {code}
> 2023-02-16 16:43:05,799 WARN
> org.apache.hadoop.ozone.container.common.impl.ContainerSet: Adding container
> 64079 to missing container set.
> {code}
> This is a byproduct of HDDS-935, which is an old and involved change. In that
> change there is a TODO in
> [{{ContainerStateMachine#persistContainerSet}}|https://github.com/apache/ozone/blob/7f22916889b7bf39cdb31e5943cae5768f368198/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L279]
> that says there will be a race if block delete does not go through Ratis.
> When block deletion was implemented after that change, it did not go through
> Ratis, so the race can occur. We need to revisit this area of the code.