[
https://issues.apache.org/jira/browse/COMPRESS-357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15326670#comment-15326670
]
Sebb commented on COMPRESS-357:
-------------------------------
There are two cases to consider:
A) the main code runs finish(), and finalize must be prevented from calling
finish() again.
B) the main code does not run finish(), so finalize must call finish()
Case A is easy to handle; just ensure that the indicator variable (out) is
properly published.
In case B, the finalize method detects that it needs to call finish().
So far so good, the finalizer thread has only accessed the properly published
variable.
But the finish() method will only work properly if it sees the correct values
for all the variables it uses.
Other variables have not been safely published, so the values seen by the
finalizer thread may be arbitrarily stale.
Note that this is not a problem of concurrent access - it is a problem of safe
publication, i.e. ensuring that the variable values written in the main thread
are seen by the finalizer thread.
This is exactly the problem reported in the description; the correct out field
value was not seen by the finalizer thread.
One way to fix this would be to make *all* the variables used by finish()
volatile.
I don't know if that would cause a serious performance hit.
> BZip2CompressorOutputStream can affect output stream incorrectly
> ----------------------------------------------------------------
>
> Key: COMPRESS-357
> URL: https://issues.apache.org/jira/browse/COMPRESS-357
> Project: Commons Compress
> Issue Type: Bug
> Components: Compressors
> Affects Versions: 1.9, 1.11
> Environment: multithreaded
> Reporter: Richard Shapiro
> Labels: easyfix
> Fix For: 1.12
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> BZip2CompressorOutputStream has an unsynchronized finished() method, and an
> unsynchronized finalize method. Finish checks to see if the output stream is
> null, and if it is not it calls various methods, some of which write to the
> output stream.
> Now, consider something like this sequence.
> BZip2OutputStream s = ...
> ...
> s.close();
> s = null;
> After the s = null, the stream is garbage. At some point the garbage
> collector call finalize(), which calls finish(). But, since the GC may be on
> a different thread, there is no guarantee that the assignment this.out = null
> in finish() has actually been made visible to the GC thread, which results in
> bad data in the output stream.
> This is not a theoretical problem; In a part of a large project I'm working
> on, this happens about 2% of the time.
> The fixes are simple
> 1) synchronize finish() or
>
> 2) don't call finish from finalize().
> A workaround is to derive a class and override the finalize() method.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)