[
https://issues.apache.org/jira/browse/FLINK-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15853923#comment-15853923
]
ASF GitHub Bot commented on FLINK-5659:
---------------------------------------
Github user zentol commented on a diff in the pull request:
https://github.com/apache/flink/pull/3219#discussion_r99576850
--- Diff: flink-core/src/main/java/org/apache/flink/util/FileUtils.java ---
@@ -148,14 +158,49 @@ public static void deleteDirectory(File directory) throws IOException {
             return;
         }
-        // delete the directory. this fails if the directory is not empty, meaning
-        // if new files got concurrently created. we want to fail then.
-        try {
-            Files.delete(directory.toPath());
-        }
-        catch (NoSuchFileException ignored) {
-            // if someone else deleted this concurrently, we don't mind
-            // the result is the same for us, after all
+        java.nio.file.Path directoryPath = directory.toPath();
+        if (OperatingSystem.isWindows()) {
+            // delete the directory. this fails if the directory is not empty, meaning
+            // if new files got concurrently created. we want to fail then.
+            try {
+                Files.delete(directoryPath);
+            } catch (NoSuchFileException ignored) {
+                // if someone else deleted this concurrently, we don't mind
+                // the result is the same for us, after all
+            } catch (AccessDeniedException e) {
+                // This may occur on Windows if another process is currently
+                // deleting the file, since the file must be opened in order
+                // to delete it. We double check here to make sure the file
+                // was actually deleted by another process. Note that this
+                // isn't a perfect solution, but it's better than nothing.
+                if (Files.exists(directoryPath)) {
+                    throw e;
+                }
+            } catch (DirectoryNotEmptyException e) {
+                // This may occur on Windows for some reason even for empty
+                // directories. Apparently there's a timing/visibility
+                // issue when concurrently deleting the contents of a directory
+                // and afterwards deleting the directory itself.
+                try {
+                    Thread.sleep(50);
--- End diff --
It only happens when multiple threads are involved; running the test with 1
thread works like a charm.
This whole thing is just strange. I did some debugging, and what happens is
that multiple threads delete the same file and verify the deletion using
`Files.exists(filename)`. All of these checks pass. A few seconds later we hit
the `DirectoryNotEmptyException`, check the contents again and, what do you
know, the file *which we just verified to be deleted* still exists.
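For anyone who wants to poke at this, here is a minimal reproduction sketch. It is not the test from the PR; the class name `ConcurrentDeleteRepro` and its methods are made up for illustration. Several threads delete the same flat directory at once and every exception is collected instead of thrown. One detail that may be relevant: the Javadoc of `Files.exists` says it returns `false` both when the file does not exist and when its existence cannot be determined, so a passing `!Files.exists(...)` check does not prove the file is gone. The sketch therefore uses `Files.notExists`, which is `true` only when absence is confirmed. Whether that actually explains the behaviour seen on Windows is an assumption, not something verified here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper, not part of Flink.
final class ConcurrentDeleteRepro {

    /** Deletes every entry of a flat directory, then the directory itself. */
    static void deleteFlatDirectory(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                Files.deleteIfExists(entry);
                // Files.exists is false for "gone" and for "cannot tell";
                // Files.notExists is true only if absence is confirmed.
                if (!Files.notExists(entry)) {
                    throw new IOException("deletion not confirmed for " + entry);
                }
            }
        } catch (NoSuchFileException ignored) {
            return; // another thread already removed the whole directory
        }
        Files.deleteIfExists(dir);
    }

    /** Runs the deletion on several threads at once and returns all failures. */
    static List<Exception> run(Path dir, int threads) throws InterruptedException {
        Queue<Exception> failures = new ConcurrentLinkedQueue<>();
        CountDownLatch start = new CountDownLatch(1);
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread worker = new Thread(() -> {
                try {
                    start.await(); // release all threads together
                    deleteFlatDirectory(dir);
                } catch (Exception e) {
                    failures.add(e);
                }
            });
            worker.start();
            workers.add(worker);
        }
        start.countDown();
        for (Thread worker : workers) {
            worker.join();
        }
        return new ArrayList<>(failures);
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("concurrent-delete");
        for (int i = 0; i < 100; i++) {
            Files.createFile(dir.resolve("file-" + i));
        }
        for (Exception failure : run(dir, 8)) {
            System.out.println(failure);
        }
        System.out.println("directory still present: " + Files.exists(dir));
    }
}
```

Running `main` on Windows and on another OS and comparing the collected exception types should show whether the `AccessDeniedException` / `DirectoryNotEmptyException` pair is platform-specific.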
> FileBaseUtils#deleteFileOrDirectory not thread-safe on Windows
> --------------------------------------------------------------
>
> Key: FLINK-5659
> URL: https://issues.apache.org/jira/browse/FLINK-5659
> Project: Flink
> Issue Type: Bug
> Components: Core, Local Runtime
> Affects Versions: 1.2.0, 1.3.0
> Reporter: Chesnay Schepler
> Assignee: Chesnay Schepler
> Priority: Trivial
>
> The {code}FileBaseUtils#deleteFileOrDirectory{code} method is not thread-safe
> on Windows.
> First you will run into AccessDeniedExceptions, since one thread tries to
> delete a file while another thread is already doing that, for which the file
> has to be opened.
> Once you resolve those exceptions (by catching them and double-checking
> whether the file still exists), you run into DirectoryNotEmptyExceptions,
> since there is some wacky timing/visibility issue when deleting files
> concurrently.
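The two-step handling described above can be sketched as follows. This is an illustration only, not the code in the pull request: the class `TolerantDelete`, the method `deleteEmptyDirectory` and the parameters `maxAttempts` and `pauseMillis` are hypothetical. The assumption is that a bounded retry with a short pause is an acceptable reaction to a spurious `DirectoryNotEmptyException`.

```java
import java.io.IOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical helper, not part of Flink.
final class TolerantDelete {

    /**
     * Deletes a directory that is expected to be empty, tolerating
     * concurrent deleters. Assumes maxAttempts >= 1.
     */
    static void deleteEmptyDirectory(Path dir, int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        for (int attempt = 1; ; attempt++) {
            try {
                Files.delete(dir);
                return;
            } catch (NoSuchFileException ignored) {
                // Someone else deleted it concurrently; same result for us.
                return;
            } catch (AccessDeniedException e) {
                // Possibly a concurrent delete in progress. Only fail if the
                // path is still visible afterwards.
                if (Files.exists(dir)) {
                    throw e;
                }
                return;
            } catch (DirectoryNotEmptyException e) {
                // May be spurious (see the description above), so retry a
                // bounded number of times before giving up.
                if (attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(pauseMillis);
            }
        }
    }
}
```

A call such as `TolerantDelete.deleteEmptyDirectory(path, 5, 50)` would retry up to five times with 50 ms pauses. A directory that really is non-empty still fails with `DirectoryNotEmptyException` once the attempts are used up, which keeps the original "we want to fail then" behaviour.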
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)