[ 
https://issues.apache.org/jira/browse/SANDBOX-280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672456#action_12672456
 ] 

Sam Smith commented on SANDBOX-280:
-----------------------------------

> what does a native tar command think of the archive?
>
> I.e. what does "tar tf your-archive.tar" say?

OK, I just re-executed that test, using a new build of commons-compress 
(revision 743098, which I checked out a few hours ago).  I also first changed my 
code to use TarArchiveInputStream/TarArchiveOutputStream.

I get the same behavior: TarArchiveOutputStream writes the 10 GiB TAR file, and 
then TarArchiveInputStream chokes when attempting to read it:
    java.io.IOException: unexpected EOF with 24064 bytes unread
        at 
org.apache.commons.compress.archivers.tar.TarArchiveInputStream.read(TarArchiveInputStream.java:339)
        at 
org.apache.commons.compress.archivers.tar.TarArchiveInputStream.copyEntryContents(TarArchiveInputStream.java:379)

BUT TarArchiveOutputStream must be silently writing corrupt data, because two 
native TAR programs bomb on it:

1) I tried extracting with jzip 1.3, but it immediately failed with an error 
dialog.

2) I have cygwin installed, so I opened a bash shell and tried your recommended 
"tar tf your-archive.tar" command, and it failed with this output:

$ tar tf tarFile#3.tar
test_archive_extract_fileSizeLimit_shouldPass_dir#1/
test_archive_extract_fileSizeLimit_shouldPass_dir#1/dataFile#2_length10737418239.txt
tar: Skipping to next header
tar: Error exit delayed from previous errors

Probably not very useful output, but it does seem to prove that 
TarArchiveOutputStream must be buggy too.
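For what it's worth, here is my guess at the cause (an assumption on my part, 
not something the commons-compress developers have confirmed): the classic 
ustar header stores the entry size in an 11-digit octal field, which caps a 
single entry at 8 GiB - 1 bytes. A 10 GiB entry overflows that field, which 
would match the "> 8 GB?" limit I guessed at in the original report. A quick 
sketch of the arithmetic (the class name is mine, purely for illustration):

```java
// Assumption: the ustar "size" header field is 11 octal digits, so the
// largest representable entry size is 0o77777777777 = 8 GiB - 1 bytes.
// A 10 GiB entry cannot be encoded in that field.
public class TarSizeFieldLimit {
    public static void main(String[] args) {
        long maxOctal11 = 077777777777L;            // 11 octal digits, all 7s
        long eightGiB   = 8L * 1024 * 1024 * 1024;  // 8589934592
        long tenGiB     = 10L * 1024 * 1024 * 1024; // 10737418240

        System.out.println("max ustar size field = " + maxOctal11); // 8589934591
        System.out.println("8 GiB - 1 fits? " + (eightGiB - 1 <= maxOctal11)); // true
        System.out.println("10 GiB fits?    " + (tenGiB <= maxOctal11));       // false
    }
}
```

If that is the problem, then TarArchiveOutputStream would need one of the 
larger-number encodings (GNU or POSIX pax extensions) to write entries past 
the 8 GiB boundary, and it should refuse outright rather than write a header 
it cannot represent.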

> unable to extract a TAR file that contains an entry which is 10 GB in size
> --------------------------------------------------------------------------
>
>                 Key: SANDBOX-280
>                 URL: https://issues.apache.org/jira/browse/SANDBOX-280
>             Project: Commons Sandbox
>          Issue Type: Bug
>          Components: Compress
>    Affects Versions: Nightly Builds
>         Environment: I am using win xp sp3, but this should be platform 
> independent.
>            Reporter: Sam Smith
>
> I made a TAR file which contains a file entry where the file is 10 GB in size.
> When I attempt to extract the file using TarInputStream, it fails with the 
> following stack trace:
>       java.io.IOException: unexpected EOF with 24064 bytes unread
>               at 
> org.apache.commons.compress.archivers.tar.TarInputStream.read(TarInputStream.java:348)
>               at 
> org.apache.commons.compress.archivers.tar.TarInputStream.copyEntryContents(TarInputStream.java:388)
> So, TarInputStream does not seem to support large (> 8 GB?) files.
> Here is something else to note: I created that TAR file using TarOutputStream, 
> which did not complain when asked to write a 10 GB file into the TAR file, 
> so I assume that TarOutputStream has no file size limits?  That, or does it 
> silently create corrupted TAR files (which would be the worst situation of 
> all...)?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
