[ https://issues.apache.org/jira/browse/HDDS-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483488#comment-17483488 ]
Ethan Rose edited comment on HDDS-6236 at 1/31/22, 10:24 PM:
-------------------------------------------------------------
A sample of one of these containers on the datanode showed:
BCSID: 50606
block count: 0
bytes used: -2097152
pending delete block count: 0
delete transaction: 48958
The chunks directory was present but empty. The container was still on the
datanode it was created on.
was (Author: erose):
A sample of one of these containers on the datanode showed:
BCSID: 50606
block count: 0
bytes used: -20971520
pending delete block count: 0
delete transaction: 48958
The chunks directory was present but empty. The container was still on the
datanode it was created on.
> SCM receives reports of unknown containers
> -------------------------------------------
>
> Key: HDDS-6236
> URL: https://issues.apache.org/jira/browse/HDDS-6236
> Project: Apache Ozone
> Issue Type: Bug
> Components: SCM
> Reporter: Ethan Rose
> Priority: Major
>
> We have noticed the following log messages in SCM leader and followers for
> multiple containers:
> {code:java}
> 2022-01-19 12:53:24,021 ERROR org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Received container report for an unknown container 1368 from datanode \{ ... }
> org.apache.hadoop.hdds.scm.container.ContainerNotFoundException: ID #1368
>     at org.apache.hadoop.hdds.scm.container.ContainerManagerImpl.lambda$getContainer$0(ContainerManagerImpl.java:147)
>     at java.base/java.util.Optional.orElseThrow(Optional.java:408)
>     at org.apache.hadoop.hdds.scm.container.ContainerManagerImpl.getContainer(ContainerManagerImpl.java:147)
>     at org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:94)
>     at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:165)
>     at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:133)
>     at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:48)
>     at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
> The cluster is currently running SCM HA, but the issue was observed when it
> was a non-HA cluster as well. This seems to only affect empty containers,
> since no data appears to be missing. Containers are supposed to exist in SCM
> DB even after they have been deleted from the datanode, so there seems to be
> some kind of bug in the container persistence logic.
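
The stack trace in the description suggests that the container lookup goes through Optional.orElseThrow and that the report handler logs the resulting ContainerNotFoundException. The following is a minimal Java sketch of that pattern, not Ozone's actual code: the class UnknownContainerSketch, the in-memory map standing in for the SCM DB, and the nested exception class are all hypothetical and exist only for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class UnknownContainerSketch {

    /** Hypothetical stand-in for the exception named in the stack trace. */
    static class ContainerNotFoundException extends Exception {
        ContainerNotFoundException(String message) {
            super(message);
        }
    }

    // Hypothetical in-memory stand-in for the SCM container table (ID -> state).
    private final Map<Long, String> containers = new ConcurrentHashMap<>();

    void addContainer(long id, String state) {
        containers.put(id, state);
    }

    /** Looks up a container and throws if SCM has no record of the ID. */
    String getContainer(long id) throws ContainerNotFoundException {
        return Optional.ofNullable(containers.get(id))
            .orElseThrow(() -> new ContainerNotFoundException("ID #" + id));
    }

    /**
     * Processes one replica from a datanode report.
     * Returns true if the replica maps to a known container, false otherwise.
     */
    boolean processContainerReplica(long id) {
        try {
            getContainer(id);
            return true;
        } catch (ContainerNotFoundException e) {
            // Mirrors the shape of the log message quoted in the issue.
            System.err.println(
                "Received container report for an unknown container " + id
                    + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        UnknownContainerSketch scm = new UnknownContainerSketch();
        scm.addContainer(1L, "CLOSED");
        scm.processContainerReplica(1L);    // known container, no error logged
        scm.processContainerReplica(1368L); // unknown container, error logged
    }
}
```

In this sketch, a datanode that still holds a replica of a container missing from the map triggers the error on every report, which matches the repeated log messages described in the issue.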