[ 
https://issues.apache.org/jira/browse/COMPRESS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541107#comment-13541107
 ] 

Peter De Maeyer commented on COMPRESS-206:
------------------------------------------

Indeed, TarArchiveOutputStream does not write garbage as such. Initially I 
thought the second EOF block was "garbage". When I dug into the code I 
understood that the second EOF block was intentional. I then realized that the 
actual problem was that TarArchiveInputStream did not read back the second EOF 
block.

Anyway, the point is that there is a unit test illustrating my use case, and 
the patch fixes it. COMPRESS-202 is merely about documentation, but my patch 
really fixes _behavior_. So I would argue that my patch does a lot more than 
just address COMPRESS-202...

Rephrasing the issue: "TarArchiveOutputStream sometimes writes bytes at the end 
of the archive which are never consumed by TarArchiveInputStream".
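
To make the rephrased issue concrete, here is a minimal sketch using only the 
Java standard library, not Commons Compress. The class TarTrailerSketch, its 
methods, and the 4-byte trailer are hypothetical stand-ins for illustration. 
The only TAR facts assumed are the 512-byte block size and the two all-zero 
blocks that mark the end of an archive. It shows that a reader which stops 
after the first EOF block leaves the second one unconsumed, so a checksum 
appended after the archive is read back as zeros.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical illustration only; this is not the Commons Compress implementation.
class TarTrailerSketch {
    static final int BLOCK = 512; // TAR block size in bytes

    // Builds: [payloadBlocks non-zero blocks][EOF block][EOF block][trailer].
    static byte[] buildStream(int payloadBlocks, byte[] trailer) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] data = new byte[BLOCK];
        Arrays.fill(data, (byte) 1); // stand-in for header and content blocks
        for (int i = 0; i < payloadBlocks; i++) {
            out.write(data, 0, BLOCK);
        }
        byte[] zero = new byte[BLOCK];
        out.write(zero, 0, BLOCK); // first EOF block
        out.write(zero, 0, BLOCK); // second EOF block
        out.write(trailer, 0, trailer.length); // e.g. a checksum written after the archive
        return out.toByteArray();
    }

    static boolean isZeroBlock(byte[] stream, int offset) {
        for (int i = offset; i < offset + BLOCK; i++) {
            if (stream[i] != 0) {
                return false;
            }
        }
        return true;
    }

    // Skips to the end of the archive, then returns the next trailerLength bytes.
    // Assumes the stream was produced by buildStream, so both EOF blocks are present.
    static byte[] readTrailer(byte[] stream, int trailerLength, boolean consumeSecondEof) {
        int pos = 0;
        while (!isZeroBlock(stream, pos)) {
            pos += BLOCK; // skip payload blocks
        }
        pos += BLOCK; // consume the first EOF block
        if (consumeSecondEof) {
            pos += BLOCK; // consume the second EOF block as well
        }
        return Arrays.copyOfRange(stream, pos, pos + trailerLength);
    }

    public static void main(String[] args) {
        byte[] checksum = {1, 2, 3, 4};
        byte[] stream = buildStream(3, checksum);
        // Reader that consumes both EOF blocks sees the checksum: [1, 2, 3, 4]
        System.out.println(Arrays.toString(readTrailer(stream, 4, true)));
        // Reader that stops after the first EOF block sees zeros: [0, 0, 0, 0]
        System.out.println(Arrays.toString(readTrailer(stream, 4, false)));
    }
}
```

A real reader would also check that the second block is all zeros before 
skipping it; the sketch omits that check to keep the asymmetry easy to see.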
                
> TarArchiveOutputStream sometimes writes garbage beyond the end of the archive
> -----------------------------------------------------------------------------
>
>                 Key: COMPRESS-206
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-206
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Compressors
>    Affects Versions: 1.0, 1.4.1
>         Environment: Linux x86
>            Reporter: Peter De Maeyer
>             Fix For: 1.5
>
>         Attachments: COMPRESS-206.patch
>
>
> For some combinations of file lengths, TarArchiveOutputStream writes garbage 
> beyond the end of the TAR stream. TarArchiveInputStream can still read the 
> stream without problems, but it does not consume the garbage. This is 
> problematic for my use case because I write a checksum _after_ the TAR 
> content. If I then try to read the checksum back, I read garbage instead.
> Functional impact:
> * TarArchiveInputStream is asymmetrical with respect to 
> TarArchiveOutputStream, in the sense that TarArchiveInputStream does not read 
> everything that was written by TarArchiveOutputStream.
> * The archive is unnecessarily large: the garbage adds ~10K of overhead 
> compared to Linux command-line tar.
> This symptom is remarkably similar to #COMPRESS-81, which was supposedly 
> fixed in 1.1. However, the issue still exists: I've tested this with 1.0 and 
> 1.4.1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
