Deployment problems caused by file deletion failures
----------------------------------------------------
Key: GERONIMO-3489
URL: https://issues.apache.org/jira/browse/GERONIMO-3489
Project: Geronimo
Issue Type: Bug
Security Level: public (Regular issues)
Components: deployment
Affects Versions: 2.0.1
Reporter: Ted Kirby
Fix For: 2.0.2, 2.0.x, 2.1
File.delete() failures in IOUtil.recursiveDelete() are causing various
deployment problems. I am opening this JIRA to discuss them and see how the
server might better handle them. In all but one case, delete failures are not
even noted with a log record! Deletion problems are seen in many environments
and platforms, but they are persistently fatal when using an NFS file system
for the repository.
While investigating the problem, I added code to recursiveDelete to retry the
delete a few times if it fails. I also added code to list the directory
contents if a directory delete failed, and saw a file named
.nfs000000002bc43500000053e in the directory. My first attempt at a bypass was
to retry a failed delete 5 times, sleeping a second before each try. This did
not work. I then added a call to System.gc() before each sleep, and this got me
past the problem. Interestingly, two retries were required to get this to work.
In another version, each retry slept a second longer than the previous one, and
I printed all file names in a directory before trying the delete. This worked
in most cases, but required the full 5 retries, so I suspect System.gc() would
have saved time. System.runFinalization() would be something else to try.
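
The retry experiment described above might look roughly like the following
sketch. It is not the actual IOUtil code: the class name RetryingDelete, the
method deleteWithRetry, and the maxAttempts and sleepMillis parameters are
hypothetical, and System.gc() is only a hint to the JVM, so this is a
workaround to experiment with rather than a guaranteed fix. Every failed
attempt is logged, since silent failures are part of the problem reported here.

```java
import java.io.File;

/** Hypothetical helper for illustration; not the actual Geronimo IOUtil code. */
final class RetryingDelete {
    private RetryingDelete() {}

    /**
     * Deletes root and everything beneath it, retrying each failed delete.
     * Returns true only if root no longer exists afterwards.
     */
    static boolean recursiveDelete(File root, int maxAttempts, long sleepMillis) {
        File[] children = root.listFiles(); // null when root is not a directory
        if (children != null) {
            for (File child : children) {
                recursiveDelete(child, maxAttempts, sleepMillis);
            }
        }
        return deleteWithRetry(root, maxAttempts, sleepMillis);
    }

    /** Tries File.delete() up to maxAttempts times, logging every failure. */
    static boolean deleteWithRetry(File file, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (file.delete() || !file.exists()) {
                return true;
            }
            System.err.println("Delete failed (attempt " + attempt + " of "
                    + maxAttempts + "): " + file);
            // For a directory, list what is still inside (e.g. .nfs* files).
            String[] leftovers = file.list();
            if (leftovers != null) {
                for (String name : leftovers) {
                    System.err.println("  still present: " + name);
                }
            }
            if (attempt == maxAttempts) {
                break; // no point in sleeping after the last attempt
            }
            // Ask the JVM to collect unreachable objects that may still hold
            // open file handles, then wait before the next attempt.
            System.gc();
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return !file.exists();
    }
}
```

A caller such as undeploy could check the boolean result and refuse to report
success while the directory still exists, for example
RetryingDelete.recursiveDelete(configurationDir, 5, 1000L).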
RepositoryConfigurationStore.createNewConfigurationDir(Artifact) shows the
failing end of the deletion problem, with the dreaded
ConfigurationAlreadyExistsException("Configuration already exists: " +
configId) exception. I think this message is misleading. It should really say
that the directory already exists. If the file is not deleted on undeploy, this
failure occurs on a subsequent deploy. What is really bad is when the user
invokes a redeploy operation and the file delete fails on the undeploy. It is
important that undeploy not complete until the file goes away.
From other environments, I am not convinced that all file handles and
references, and particularly open streams, are being closed on some artifacts.
This will cause the delete to fail. It may be that the gc() calls are
cleaning these up, and allowing the deletes to work in my case above.
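
As a point of comparison, the sketch below shows a stream being closed
deterministically instead of waiting for garbage collection. The
.nfs000000002bc43500000053e file seen earlier is consistent with the open
handle theory: NFS clients typically rename a file to a .nfs* name when it is
removed while something still has it open. The class StreamHygiene and the
method countBytes are hypothetical and only illustrate the pattern; the
try-with-resources form needs Java 7 or later, and older runtimes would use
try/finally instead.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;

/** Hypothetical example of reading an artifact without leaking the handle. */
final class StreamHygiene {
    private StreamHygiene() {}

    /** Reads the whole file and returns its size in bytes. */
    static int countBytes(File file) throws IOException {
        int total = 0;
        byte[] buffer = new byte[4096];
        // The stream is closed when this block exits, even on an exception,
        // so a later File.delete() does not depend on a gc() cycle.
        try (InputStream in = Files.newInputStream(file.toPath())) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }
}
```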
Another option is that
RepositoryConfigurationStore.createNewConfigurationDir(Artifact) not throw a
ConfigurationAlreadyExistsException if the only problem is that an empty
directory structure exists. The next line creates the directory structure
anyway.
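
A minimal sketch of that option follows. It is not the real
RepositoryConfigurationStore code: the class ConfigurationDirs and the helper
isEmptyDirectoryTree are hypothetical, the method takes a File instead of an
Artifact, and IOException stands in for ConfigurationAlreadyExistsException so
that the example is self-contained. It also uses a message that names the
directory, as suggested above.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical stand-in for the directory check discussed in this issue. */
final class ConfigurationDirs {
    private ConfigurationDirs() {}

    /** True if dir is a directory tree that contains directories only. */
    static boolean isEmptyDirectoryTree(File dir) {
        File[] children = dir.listFiles();
        if (children == null) {
            return false; // a regular file, or a directory that cannot be read
        }
        for (File child : children) {
            if (!isEmptyDirectoryTree(child)) {
                return false;
            }
        }
        return true;
    }

    /** Creates the directory, tolerating a leftover empty directory structure. */
    static File createNewConfigurationDir(File location) throws IOException {
        if (location.exists() && !isEmptyDirectoryTree(location)) {
            // Stand-in for ConfigurationAlreadyExistsException.
            throw new IOException("Directory already exists and is not empty: "
                    + location);
        }
        if (!location.isDirectory() && !location.mkdirs()) {
            throw new IOException("Could not create directory: " + location);
        }
        return location;
    }
}
```

With this check, a deploy that follows an undeploy which removed the files but
left empty directories behind would proceed, while a directory that still
holds files is still rejected.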