[ https://issues.apache.org/jira/browse/HDDS-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968767#comment-16968767 ]
Anu Engineer commented on HDDS-2372:
------------------------------------
In the chunk write path, we write each chunk to a temp file and then rename it
to the final chunk file.
However, until we commit a block, every chunk file is effectively a temp file,
since no one can see the chunk file name until we commit the ChunkInfo into
RocksDB.
So if we remove the tmpChunkFile and always write to the real chunk file, this
race condition will go away.
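
To make the proposal concrete, here is a minimal sketch of the two write
patterns using plain java.nio. It is not the actual datanode code: the class
ChunkWriteSketch, the methods writeViaTemp and writeDirect, and the ".tmp"
suffix are hypothetical names chosen for illustration. The main method also
shows, as an assumed failure mode, that renaming a temp file that is already
gone is what raises NoSuchFileException.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; names here do not come from the Ozone code base.
final class ChunkWriteSketch {

    // Pattern described above: write to a temp file, then rename it.
    // The rename fails with NoSuchFileException if the temp file has
    // disappeared between the write and the move.
    static Path writeViaTemp(Path chunkDir, String chunkName, byte[] data)
            throws IOException {
        Path tmp = chunkDir.resolve(chunkName + ".tmp");
        Path finalFile = chunkDir.resolve(chunkName);
        Files.write(tmp, data,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
        return Files.move(tmp, finalFile, StandardCopyOption.REPLACE_EXISTING);
    }

    // Proposed pattern: write straight to the final chunk file. There is no
    // rename step, so there is no temp file that can go missing. Visibility
    // is assumed to be controlled by the later metadata commit, not by the
    // file name.
    static Path writeDirect(Path chunkDir, String chunkName, byte[] data)
            throws IOException {
        Path finalFile = chunkDir.resolve(chunkName);
        return Files.write(finalFile, data,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("chunks");
        byte[] data = "testdata".getBytes(StandardCharsets.UTF_8);

        Path a = writeViaTemp(dir, "chunk_a", data);
        Path b = writeDirect(dir, "chunk_b", data);
        System.out.println("wrote " + a.getFileName() + " and " + b.getFileName());

        // Simulate the temp file vanishing before the rename.
        Path missingTmp = dir.resolve("chunk_c.tmp");
        try {
            Files.move(missingTmp, dir.resolve("chunk_c"));
        } catch (NoSuchFileException e) {
            System.out.println("rename failed, temp file missing: " + e.getFile());
        }
    }
}
```

Both methods leave the same bytes on disk under the final name; the only
difference is that writeDirect has no intermediate file for a concurrent
writer of the same chunk to move or delete.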
> Datanode pipeline is failing with NoSuchFileException
> -----------------------------------------------------
>
> Key: HDDS-2372
> URL: https://issues.apache.org/jira/browse/HDDS-2372
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Marton Elek
> Assignee: Shashikant Banerjee
> Priority: Critical
>
> Found on a k8s-based test cluster, using a simple 3-node cluster and the
> HDDS-2327 freon test. After a while the StateMachine becomes unhealthy after
> this error:
> {code:java}
> datanode-0 datanode java.util.concurrent.ExecutionException:
> java.util.concurrent.ExecutionException:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
> java.nio.file.NoSuchFileException:
> /data/storage/hdds/2a77fab9-9dc5-4f73-9501-b5347ac6145c/current/containerDir0/1/chunks/gGYYgiTTeg_testdata_chunk_13931.tmp.2.20830
> {code}
> Can be reproduced.