[
https://issues.apache.org/jira/browse/FLINK-33189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772095#comment-17772095
]
Matthias Pohl commented on FLINK-33189:
---------------------------------------
Hi [~vladov], thanks for bringing this to the community's attention. You're
right: The JavaDoc is misleading. This part of the code isn't my field of
expertise. But looking at the code, it's most likely that Flink's FileSystem
interface was derived from [Hadoop's FileSystem
interface|https://github.com/apache/hadoop/blob/rel/release-2.10.2/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1551]
(which has the exact same requirement stated in the JavaDoc).
Flink's general {{FileSystem}} implementations do not seem to follow this
contract properly (the {{FileSystem}} contract is generally not well
implemented based on my experience).
[LocalFileSystem|https://github.com/apache/flink/blob/c6997c97c575d334679915c328792b8a3067cfb5/flink-core/src/main/java/org/apache/flink/core/fs/local/LocalFileSystem.java#L174]
only fails if the directory is not empty or doesn't exist. [Flink's
HadoopFileSystem|https://github.com/apache/flink/blob/0e4a6661deefa06039f422148164b8f765eafc98/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopFileSystem.java]
relies on [Hadoop's
S3AFileSystem|https://github.com/apache/hadoop/blob/rel/release-2.10.2/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L1449]
where the delete call also only fails if the directory is not empty. The
customized delete functionality of the
[FlinkS3PrestoFileSystem|https://github.com/apache/flink/blob/4907f5fb7033b1f918c4e721a1ccf95878f980a2/flink-filesystems/flink-s3-fs-presto/src/main/java/org/apache/flink/fs/s3presto/FlinkS3PrestoFileSystem.java#L62]
doesn't do the content check but rather relies on the [delete functionality of
PrestoS3FileSystem|https://github.com/prestodb/presto/blob/0.272/presto-hive/src/main/java/com/facebook/presto/hive/s3/PrestoS3FileSystem.java#L475].
{{PrestoS3FileSystem}} actually follows the contract and throws an
{{IOException}} if {{false}} is passed (without checking whether the directory
is empty; even though the error message of the {{IOException}} states something
else).
This analysis proves my point that the {{FileSystem}} contract is not properly
followed. Generally, the Flink 2.0 efforts would be a great opportunity to
align the {{FileSystem}} contract.
> FsCompletedCheckpointStorageLocation#disposeStorageLocation non-recursively
> deletes a directory
> -----------------------------------------------------------------------------------------------
>
> Key: FLINK-33189
> URL: https://issues.apache.org/jira/browse/FLINK-33189
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Reporter: Vlado Vojdanovski
> Priority: Major
> Original Estimate: 10m
> Remaining Estimate: 10m
>
> FsCompletedCheckpointStorageLocation attempts to non-recursively delete a
> directory
> [https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/filesystem/FsCompletedCheckpointStorageLocation.java#L74]
> However, per the documentation of Flink's FileSystem Interface, such attempts
> are explicitly expected to fail.
> {code:java}
> * @param recursive if path is a directory and set to <code>true</code>,
> the directory is
> * deleted else throws an exception. In case of a file the recursive
> can be set to either
> * <code>true</code> or <code>false</code>{code}
> [https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java#L689]
> I am sure there is a non-negligible chance I am missing some flink internals
> here considering the class has not been touched since 2018 but my read is the
> above is either a bug, or it would be nice to update the FileSystem#delete
> docs.
> Thanks for taking a look :) !
--
This message was sent by Atlassian Jira
(v8.20.10#820010)