[
https://issues.apache.org/jira/browse/HDFS-16322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444434#comment-17444434
]
Tsz-wo Sze edited comment on HDFS-16322 at 11/16/21, 10:21 AM:
---------------------------------------------------------------
[~Nsupyq], do you have any suggestions for fixing it?
As [~shv] mentioned, other operations also have the same problem. Consider
the delete sequence below:
# client c0 deletes a file foo.
# client c1 checks that foo does not exist and then creates a new file at the same path.
# client c0 retries the delete.
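
To make the interleaving concrete, here is a minimal sketch in plain Java. It uses a toy in-memory map instead of the NameNode, so the class DeleteRetrySketch and its methods are hypothetical and are not part of the HDFS API. It also assumes a naive delete that keeps no record of which calls were already applied; it does not model the NameNode's retry cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory namespace (hypothetical; NOT the HDFS NameNode API). */
class DeleteRetrySketch {
    // path -> file contents
    private final Map<String, String> files = new HashMap<>();

    boolean exists(String path) {
        return files.containsKey(path);
    }

    void create(String path, String data) {
        files.put(path, data);
    }

    // Assumption: a naive delete that cannot tell a retry from a new request.
    boolean delete(String path) {
        return files.remove(path) != null;
    }

    /** Replays the three steps; returns true if c1's file survives. */
    static boolean c1FileSurvives() {
        DeleteRetrySketch ns = new DeleteRetrySketch();
        ns.create("/foo", "written by c0");

        ns.delete("/foo");                       // 1. c0 deletes; assume the response is lost
        if (!ns.exists("/foo")) {                // 2. c1 sees no file and creates a new one
            ns.create("/foo", "written by c1");
        }
        ns.delete("/foo");                       // 3. c0 retries the same delete

        return ns.exists("/foo");
    }

    public static void main(String[] args) {
        System.out.println("c1's file survives: " + c1FileSurvives());
    }
}
```

In this toy model the retry in step 3 removes the file that c1 created in step 2, which is the same kind of loss as described for truncate.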
> The NameNode implementation of ClientProtocol.truncate(...) can cause data
> loss.
> --------------------------------------------------------------------------------
>
> Key: HDFS-16322
> URL: https://issues.apache.org/jira/browse/HDFS-16322
> Project: Hadoop HDFS
> Issue Type: Bug
> Environment: The runtime environment is Ubuntu 18.04, Java 1.8.0_222
> and Apache Maven 3.6.0.
> The bug can be reproduced by the testMultipleTruncate() in the
> attachment. First, replace the file TestFileTruncate.java under the directory
> "hadoop-3.3.1-src/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/"
> with the attachment. Then run "mvn test
> -Dtest=org.apache.hadoop.hdfs.server.namenode.TestFileTruncate#testMultipleTruncate"
> to run the test case. Finally, the "assertFileLength(p, n+newLength)" at line
> 199 of TestFileTruncate.java will fail, because the retry of truncate()
> changes the file size and causes data loss.
> Reporter: nhaorand
> Priority: Major
> Attachments: TestFileTruncate.java
>
>
> The NameNode implementation of ClientProtocol.truncate(...) can cause data
> loss. If the DFS client drops the first response of a truncate RPC call, the
> retried call will truncate the file again and cause data loss.
> HDFS-7926 avoids repeated execution of truncate(...) by checking if the file
> is already being truncated with the same length. However, under concurrency,
> after the first execution of truncate(...), concurrent requests from other
> clients may append new data and change the file length. When truncate(...) is
> retried after that, it will find the file has not been truncated with the
> same length and truncate it again, which causes data loss.
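
The sequence described above can be sketched with a toy model that tracks only a file's length. This is an illustration under stated assumptions, not the NameNode code: the class TruncateRetrySketch is hypothetical, and its duplicate check (treat a call as already applied only when the file is currently at the target length) is a simplified stand-in for the check attributed to HDFS-7926.

```java
/** Toy model of a single file's length (hypothetical; NOT the NameNode implementation). */
class TruncateRetrySketch {
    private long length;

    TruncateRetrySketch(long initialLength) {
        this.length = initialLength;
    }

    long length() {
        return length;
    }

    void append(long bytes) {
        length += bytes;
    }

    // Assumption: a length-based duplicate check. The call is skipped only
    // if the file is already exactly at the requested length.
    void truncate(long newLength) {
        if (length == newLength) {
            return; // looks like a retry of a truncate that already completed
        }
        if (newLength > length) {
            throw new IllegalArgumentException("cannot truncate to a larger length");
        }
        length = newLength;
    }

    /** Returns {length before the retry, length after the retry}. */
    static long[] replay(long n, long newLength) {
        TruncateRetrySketch file = new TruncateRetrySketch(n);
        file.truncate(newLength);      // c0 truncates; assume the response is lost
        file.append(n);                // another client appends n bytes
        long beforeRetry = file.length();
        file.truncate(newLength);      // c0 retries; the length check no longer matches
        return new long[] { beforeRetry, file.length() };
    }

    public static void main(String[] args) {
        long[] r = replay(10, 4);
        System.out.println("before retry: " + r[0] + ", after retry: " + r[1]);
        // prints "before retry: 14, after retry: 4"
    }
}
```

With n = 10 and newLength = 4, the file holds n + newLength = 14 bytes before the retry and only 4 afterwards, so the appended bytes are lost in this model.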