[ 
https://issues.apache.org/jira/browse/HDDS-3853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147988#comment-17147988
 ] 

Marton Elek commented on HDDS-3853:
-----------------------------------

Same comment as on HDDS-3852:

We discussed it during the Community Meeting. The problem seems to be hard to 
reproduce, so we moved it out of 0.7.0. Feel free to move it back if 
you think it's important to fix (especially since you, as the release manager, 
have the final decision). 

Personally, I think we need more tests with long-running Ozone clusters. The 
upgrade tests introduced by Attila might also help. 

If you have any more logs or other information, please share them so we can 
investigate. 

> Container marked as missing on datanode while container directory do exist
> --------------------------------------------------------------------------
>
>                 Key: HDDS-3853
>                 URL: https://issues.apache.org/jira/browse/HDDS-3853
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Sammi Chen
>            Assignee: Shashikant Banerjee
>            Priority: Major
>
> {code}
> INFO org.apache.hadoop.ozone.container.common.impl.HddsDispatcher: Operation: 
> PutBlock , Trace ID: 487c959563e884b9:509a3386ba37abc6:487c959563e884b9:0 , 
> Message: ContainerID 1744 has been lost and and cannot be recreated on this 
> DataNode , Result: CONTAINER_MISSING , StorageContainerException Occurred.
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
>  ContainerID 1744 has been lost and and cannot be recreated on this DataNode
>         at 
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:238)
>         at 
> org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:166)
>         at 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:395)
>         at 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.runCommand(ContainerStateMachine.java:405)
>         at 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$applyTransaction$6(ContainerStateMachine.java:749)
>         at 
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
>  ERROR 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine:
>  gid group-1376E41FD581 : ApplyTransaction failed. cmd PutBlock logIndex 
> 40079 msg : ContainerID 1744 has been lost and and cannot be recreated on 
> this DataNode Container Result: CONTAINER_MISSING
>  ERROR 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis:
>  pipeline Action CLOSE on pipeline 
> PipelineID=de21dfcf-415c-4901-84ca-1376e41fd581.Reason : Ratis Transaction 
> failure in datanode 33b49c34-caa2-4b4f-894e-dce7db4f97b9 with role FOLLOWER 
> .Triggering pipeline close action
>  {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

