[ https://issues.apache.org/jira/browse/HDFS-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16406957#comment-16406957 ]
genericqa commented on HDFS-10323:
----------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s{color} | {color:red} HDFS-10323 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-10323 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12896812/HDFS-10323.003.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/23567/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |
This message was automatically generated.
> transient deleteOnExit failure in ViewFileSystem due to close() ordering
> ------------------------------------------------------------------------
>
> Key: HDFS-10323
> URL: https://issues.apache.org/jira/browse/HDFS-10323
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: federation
> Affects Versions: 2.6.0, 2.7.4, 3.0.0-beta1
> Reporter: Ben Podgursky
> Assignee: Wenxin He
> Priority: Major
> Attachments: HDFS-10323.001.patch, HDFS-10323.002.patch,
> HDFS-10323.003.patch
>
>
> After switching to a ViewFileSystem, fs.deleteOnExit calls began failing
> frequently, logging this message on failure:
> 16/04/21 13:56:24 INFO fs.FileSystem: Ignoring failure to deleteOnExit for path /tmp/delete_on_exit_test_123/a438afc0-a3ca-44f1-9eb5-010ca4a62d84
> Since FileSystem swallows the underlying exception, it is difficult to be
> sure what the error is. I believe the ViewFileSystem’s child FileSystems are
> being close()’d before the ViewFileSystem itself, because ClientFinalizer
> closes FileSystems in no particular order. When the ViewFileSystem then
> close()s, it tries to forward the delete() calls to the appropriate child
> and fails because that child is already closed.
> I’m unsure how to write an actual Hadoop test to reproduce this, since it
> involves testing behavior on actual JVM shutdown. However, I can verify that
> while
> {code:java}
> fs.deleteOnExit(randomTemporaryDir);
> {code}
> regularly (~50% of the time) fails to delete the temporary directory, this
> code:
> {code:java}
> ViewFileSystem viewfs = (ViewFileSystem)fs1;
> for (FileSystem fileSystem : viewfs.getChildFileSystems()) {
>   if (fileSystem.exists(randomTemporaryDir)) {
>     fileSystem.deleteOnExit(randomTemporaryDir);
>   }
> }
> {code}
> always successfully deletes the temporary directory on JVM shutdown.
> I am not very familiar with FileSystem inheritance hierarchies, but at first
> glance I see two ways to fix this behavior:
> 1) ViewFileSystem could forward deleteOnExit calls to the appropriate child
> FileSystem, and not hold onto that path itself.
> 2) FileSystem.Cache.closeAll could first close all ViewFileSystems, then all
> other FileSystems.
> I would appreciate any thoughts on whether this diagnosis seems accurate,
> and thoughts (or help) on the fix.
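
The close-ordering hypothesis and fix option 2 from the description can be sketched without Hadoop. This is a minimal, hypothetical model and not the HDFS-10323 patch: ToyFs, ToyViewFs, closeAll, and run are invented stand-ins for FileSystem, ViewFileSystem, and FileSystem.Cache.closeAll, and the real classes are considerably more involved. It assumes, as the report suspects, that a closed child rejects delete() and that the failure is swallowed during close().

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CloseOrderSketch {

    /** Hypothetical stand-in for a concrete file system; not a Hadoop class. */
    static class ToyFs {
        final Set<String> files = new HashSet<>();              // paths that currently exist
        final Set<String> deleteOnExit = new LinkedHashSet<>(); // paths to remove on close()
        boolean closed = false;

        boolean delete(String path) {
            if (closed) {
                throw new IllegalStateException("file system already closed");
            }
            return files.remove(path);
        }

        void deleteOnExit(String path) {
            deleteOnExit.add(path);
        }

        void close() {
            for (String path : deleteOnExit) {
                try {
                    delete(path);
                } catch (RuntimeException e) {
                    // Swallowed on purpose, modelling the "Ignoring failure to deleteOnExit" log line.
                }
            }
            deleteOnExit.clear();
            closed = true;
        }
    }

    /** Hypothetical stand-in for a view that forwards delete() to a single child. */
    static class ToyViewFs extends ToyFs {
        final ToyFs child;

        ToyViewFs(ToyFs child) {
            this.child = child;
        }

        @Override
        boolean delete(String path) {
            if (closed) {
                throw new IllegalStateException("view already closed");
            }
            return child.delete(path); // throws if the child was closed first
        }
    }

    /** Closes every cached file system; with viewsFirst, views are closed before their children. */
    static void closeAll(List<ToyFs> cache, boolean viewsFirst) {
        List<ToyFs> ordered = new ArrayList<>(cache);
        if (viewsFirst) {
            // Stable sort: key false (a view) sorts before key true (a non-view).
            ordered.sort(Comparator.comparing((ToyFs fs) -> !(fs instanceof ToyViewFs)));
        }
        for (ToyFs fs : ordered) {
            fs.close();
        }
    }

    /** Returns true if the path registered through the view was actually deleted. */
    static boolean run(boolean viewsFirst) {
        ToyFs child = new ToyFs();
        child.files.add("/tmp/example");
        ToyViewFs view = new ToyViewFs(child);
        view.deleteOnExit("/tmp/example");

        List<ToyFs> cache = new ArrayList<>();
        cache.add(child); // unlucky order: the child would be closed before the view
        cache.add(view);
        closeAll(cache, viewsFirst);
        return !child.files.contains("/tmp/example");
    }

    public static void main(String[] args) {
        System.out.println("child closed first, deleted: " + run(false));
        System.out.println("views closed first, deleted: " + run(true));
    }
}
```

In this model the path survives when the child is closed first and is removed when views are closed first. Option 1 would instead have the view register the path with the owning child, which makes the outcome independent of close order.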
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)