[ 
https://issues.apache.org/jira/browse/HDDS-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483488#comment-17483488
 ] 

Ethan Rose edited comment on HDDS-6236 at 1/31/22, 10:24 PM:
-------------------------------------------------------------

A sample of one of these containers on the datanode showed:

BCSID: 50606

block count: 0

bytes used: -2097152

pending delete block count: 0

delete transaction: 48958

The chunks directory was present but empty. The container was still on the 
datanode it was created on.


was (Author: erose):
A sample of one of these containers on the datanode showed:

BCSID: 50606

block count: 0

bytes used: -20971520

pending delete block count: 0

delete transaction: 48958

The chunks directory was present but empty. The container was still on the 
datanode it was created on.

> SCM receives reports of unknown containers 
> -------------------------------------------
>
>                 Key: HDDS-6236
>                 URL: https://issues.apache.org/jira/browse/HDDS-6236
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: SCM
>            Reporter: Ethan Rose
>            Priority: Major
>
> We have noticed the following log messages in SCM leader and followers for 
> multiple containers:
> {code:java}
> 2022-01-19 12:53:24,021 ERROR org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Received container report for an unknown container 1368 from datanode \{ ... }
> org.apache.hadoop.hdds.scm.container.ContainerNotFoundException: ID #1368
>         at org.apache.hadoop.hdds.scm.container.ContainerManagerImpl.lambda$getContainer$0(ContainerManagerImpl.java:147)
>         at java.base/java.util.Optional.orElseThrow(Optional.java:408)
>         at org.apache.hadoop.hdds.scm.container.ContainerManagerImpl.getContainer(ContainerManagerImpl.java:147)
>         at org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:94)
>         at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:165)
>         at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:133)
>         at org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:48)
>         at org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
> The cluster is currently running SCM HA, but the issue was observed when it 
> was a non-HA cluster as well. This seems to only affect empty containers, 
> since no data appears to be missing. Containers are supposed to exist in SCM 
> DB even after they have been deleted from the datanode, so there seems to be 
> some kind of bug in the container persistence logic.
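
To gauge how many containers are affected, the error message quoted above can be matched in the SCM logs. The following is only a minimal sketch, not part of Ozone: the class name `UnknownContainerLogScan` and the method `unknownContainerIds` are hypothetical, and it assumes each message appears on a single line in the actual log file (the excerpt above is wrapped by the mail client).

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an Ozone class: collects the IDs of containers
// that SCM reported as unknown, based on the message text in the report.
class UnknownContainerLogScan {

    // Message text copied from the log excerpt; the container ID is captured.
    private static final Pattern UNKNOWN_CONTAINER = Pattern.compile(
        "Received container report for an unknown container (\\d+) from datanode");

    // Returns the distinct container IDs found, in ascending order.
    static SortedSet<Long> unknownContainerIds(List<String> logLines) {
        SortedSet<Long> ids = new TreeSet<>();
        for (String line : logLines) {
            Matcher m = UNKNOWN_CONTAINER.matcher(line);
            if (m.find()) {
                ids.add(Long.parseLong(m.group(1)));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Sample input modeled on the excerpt; the datanode details are elided there too.
        List<String> sample = List.of(
            "2022-01-19 12:53:24,021 ERROR "
                + "org.apache.hadoop.hdds.scm.container.ContainerReportHandler: "
                + "Received container report for an unknown container 1368 "
                + "from datanode { ... }",
            "org.apache.hadoop.hdds.scm.container.ContainerNotFoundException: ID #1368");
        System.out.println(unknownContainerIds(sample));
    }
}
```

The resulting IDs could then be checked on the datanodes for the same pattern seen in the sampled container (zero block count, empty chunks directory).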



