Ray Navarette created JCR-4049:
----------------------------------
Summary: BLOBInTempFile deletes underlying file before it can be used
Key: JCR-4049
URL: https://issues.apache.org/jira/browse/JCR-4049
Project: Jackrabbit Content Repository
Issue Type: Bug
Components: jackrabbit-core
Affects Versions: 2.12.1
Reporter: Ray Navarette
Consider the following:
{code}
InputStream in = node.getProperty(name).getBinary().getStream();
Object result = processTheStream(in); // IOException sometimes thrown
{code}
When a BLOBInTempFile is no longer referenced, its finalizer deletes the
underlying temp file. However, an InputStream created from Binary.getStream()
may still be in use, and the deletion causes an IOException to be thrown. I
didn't see any mention in the javadoc that a reference to the Binary object
must be retained for the underlying stream to remain valid, so this seems to
be a side effect of how this specific implementation handles temp file cleanup.
This issue can be difficult to track down because garbage collection
determines when the references are cleared and hence when the files are
deleted.
I believe either of these options would be sufficient to fix the problem:
1) Update BLOBInTempFile.getStream() to return an InputStream implementation
that maintains a reference to the BLOBInTempFile object that creates it. This
way there is a valid reference to the BLOBInTempFile object until the input
stream itself goes out of scope, preventing the file from being prematurely
deleted. I've done this at the application level as a hack by extending
FilterInputStream and maintaining an unused reference to the Binary, and this
solution seems to do the trick.
2) Update BLOBInTempFile finalizer to _not_ delete the underlying file. This
may not be sufficient in the case where the file is not constructed by the
TransientFileFactory, but the actual usage may cover this case. If we know the
temp file was properly created, though, it should be cleaned up like all other
temp files once there are no more references to it. The reference held by the
InputStream would then be sufficient to keep the file in existence until it
was used, even if the BLOBInTempFile was no longer in scope.
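
For reference, the application-level workaround described in option 1 can be sketched roughly as follows. This is only an illustration, not the code I actually use and not a proposed patch: the class name OwnerRetainingInputStream and the getOwner() accessor are made up for this example, and the owner is typed as Object so the snippet compiles without the JCR API on the classpath. It assumes the wrapper itself stays reachable for as long as the stream is being read.

```java
import java.io.FilterInputStream;
import java.io.InputStream;

/**
 * Hypothetical wrapper stream that keeps a strong reference to the object
 * that produced the wrapped stream (for example, a JCR Binary). While this
 * wrapper is reachable, the owner is reachable too, so the owner's finalizer
 * cannot run and delete the backing temp file.
 */
final class OwnerRetainingInputStream extends FilterInputStream {

    // Never used for I/O; it exists only to keep the owner strongly reachable.
    private final Object owner;

    OwnerRetainingInputStream(InputStream in, Object owner) {
        super(in);
        this.owner = owner;
    }

    /** Hypothetical accessor, provided so the retained reference can be inspected. */
    Object getOwner() {
        return owner;
    }
}
```

With the JCR API available, the calling code would keep the Binary in a local variable and wrap its stream, e.g. new OwnerRetainingInputStream(binary.getStream(), binary), then pass the wrapper to processTheStream() instead of the raw stream.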