[ https://issues.apache.org/jira/browse/HDDS-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chung En Lee updated HDDS-6236:
-------------------------------
Target Version/s: 2.2.0 (was: 2.1.0)
<<Bulk update>>
The Apache Ozone 2.1.0 release is in progress. I'm retargeting all unresolved
Jiras that target 2.1.0 to 2.2.0.
> SCM receives reports of unknown containers
> -------------------------------------------
>
> Key: HDDS-6236
> URL: https://issues.apache.org/jira/browse/HDDS-6236
> Project: Apache Ozone
> Issue Type: Bug
> Components: SCM
> Reporter: Ethan Rose
> Assignee: Swaminathan Balachandran
> Priority: Major
>
> We have noticed the following log messages in SCM leader and followers for
> multiple containers:
> {code:java}
> 2022-01-19 12:53:24,021 ERROR org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Received container report for an unknown container 1368 from datanode \{ ... }
> org.apache.hadoop.hdds.scm.container.ContainerNotFoundException: ID #1368
>     at org.apache.hadoop.hdds.scm.container.ContainerManagerImpl.lambda$getContainer$0(ContainerManagerImpl.java:147)
>     at java.base/java.util.Optional.orElseThrow(Optional.java:408)
>     at org.apache.hadoop.hdds.scm.container.ContainerManagerImpl.getContainer(ContainerManagerImpl.java:147)
>     at org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:94)
>     at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:165)
>     at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:133)
>     at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:48)
>     at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
> The cluster is currently running SCM HA, but the issue was observed when it
> was a non-HA cluster as well. This seems to only affect empty containers,
> since no data appears to be missing. Containers are supposed to exist in SCM
> DB even after they have been deleted from the datanode, so there seems to be
> some kind of bug in the container persistence logic.