[ https://issues.apache.org/jira/browse/HDDS-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968213#comment-16968213 ]
Marton Elek commented on HDDS-2372:
-----------------------------------

Yes, I tried this, but it doesn't work. Assume the algorithm is the following:

# file := finalPath
# if (!file.exists()) file := tmpPath
# if (!file.exists()) file := finalPath

If the move happens between steps 2 and 3, the value of file will be tmpPath instead of finalPath.

One option is to catch the FileNotFound exception and retry.

But there is another (slightly different) question: what about a read and a write happening at the same time? How is it guaranteed that writeStateMachineData has finished before the next readStateMachineData starts? Is it guaranteed by Ratis?

> Datanode pipeline is failing with NoSuchFileException
> -----------------------------------------------------
>
> Key: HDDS-2372
> URL: https://issues.apache.org/jira/browse/HDDS-2372
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Marton Elek
> Assignee: Shashikant Banerjee
> Priority: Critical
>
> Found it on a k8s based test cluster using a simple 3 node cluster and
> HDDS-2327 freon test. After a while the StateMachine became unhealthy after
> this error:
> {code:java}
> datanode-0 datanode java.util.concurrent.ExecutionException:
> java.util.concurrent.ExecutionException:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
> java.nio.file.NoSuchFileException:
> /data/storage/hdds/2a77fab9-9dc5-4f73-9501-b5347ac6145c/current/containerDir0/1/chunks/gGYYgiTTeg_testdata_chunk_13931.tmp.2.20830
> {code}
> Can be reproduced.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
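
For illustration, the "catch the exception and retry" option mentioned in the comment could look roughly like the sketch below. It is not the Ozone or Ratis implementation: the class ChunkReadRetry, the method readChunk and the maxAttempts parameter are hypothetical names. The idea is to skip the exists() checks entirely, try to read finalPath and then tmpPath, and go back to finalPath if both attempts fail because a rename happened in between. It assumes the writer renames tmpPath to finalPath in a single step, and it does not address the separate question of whether a read can overlap an unfinished write.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

/** Hypothetical helper; the names are illustrative, not taken from Ozone. */
final class ChunkReadRetry {
  private ChunkReadRetry() {
  }

  /**
   * Reads the chunk from finalPath, falling back to tmpPath. If both
   * attempts fail, the tmp file may have been renamed to finalPath between
   * the two reads, so the pair of reads is retried up to maxAttempts times.
   */
  static byte[] readChunk(Path finalPath, Path tmpPath, int maxAttempts)
      throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    NoSuchFileException last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        // Attempt the read directly instead of checking exists() first.
        return Files.readAllBytes(finalPath);
      } catch (NoSuchFileException e) {
        last = e; // not committed yet (or never written)
      }
      try {
        return Files.readAllBytes(tmpPath);
      } catch (NoSuchFileException e) {
        last = e; // tmp may just have been moved; retry finalPath
      }
    }
    throw last; // neither path could be read within maxAttempts rounds
  }
}
```

A caller would use something like ChunkReadRetry.readChunk(finalPath, tmpPath, 3). Because each read opens the file directly, there is no window between an existence check and the open in which the rename can invalidate the chosen path.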