[ https://issues.apache.org/jira/browse/COMPRESS-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kirk Hardy updated COMPRESS-469:
--------------------------------
    Description:
Using TarArchiveInputStream to parse this tar archive results in a

`java.io.IOException: Error detected parsing the header`
`Caused by: java.lang.IllegalArgumentException: Invalid byte 78 at offset 8 in '00000000NaN\{NUL}' len=12`

The `00000000NaN\{NUL}` value is indeed present in the tar archive at the point where TarArchiveInputStream reads it and fails, but both BSDTar from a bash shell and JTar from a Java environment handle the archive fine, including the file with the malformed header. My commons-compress code is:

{code:java}
private static List<File> unTar(final File inputFile, final File outputDir)
        throws FileNotFoundException, IOException, ArchiveException {

    System.out.println(String.format("Untaring %s to dir %s.",
            inputFile.getAbsolutePath(), outputDir.getAbsolutePath()));

    final List<File> untaredFiles = new LinkedList<File>();
    final InputStream is = new FileInputStream(inputFile);
    final TarArchiveInputStream debInputStream = (TarArchiveInputStream)
            new ArchiveStreamFactory().createArchiveInputStream("tar", is);
    TarArchiveEntry entry = null;
    while ((entry = (TarArchiveEntry) debInputStream.getNextEntry()) != null) {
        final File outputFile = new File(outputDir, entry.getName());
        if (entry.isDirectory()) {
            System.out.println(String.format("Attempting to write output directory %s.",
                    outputFile.getAbsolutePath()));
            if (!outputFile.exists()) {
                System.out.println(String.format("Attempting to create output directory %s.",
                        outputFile.getAbsolutePath()));
                if (!outputFile.mkdirs()) {
                    throw new IllegalStateException(
                            String.format("Couldn't create directory %s.",
                                    outputFile.getAbsolutePath()));
                }
            }
        } else {
            System.out.println(String.format("Creating output file %s.",
                    outputFile.getAbsolutePath()));
            final OutputStream outputFileStream = new FileOutputStream(outputFile);
            IOUtils.copy(debInputStream, outputFileStream);
            outputFileStream.close();
        }
        untaredFiles.add(outputFile);
    }
    debInputStream.close();
    return untaredFiles;
}
{code}

[^compress-bug-POC.tar.gz]

I've attached an archive of my proof-of-concept code that un-archives the problem archive using JTar, then attempts to un-archive it using commons-compress. The problem archive is included. Unfortunately, I wasn't involved in creating the problem archive; it was downloaded from NPM last year, so I don't have any details about the archival process that caused the malformed header.

> java.io.IOException: Error detected parsing the header on un-archiving.
> Archive untars fine in BSDTar and JTar
> -----------------------------------------------------------------------
>
>                 Key: COMPRESS-469
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-469
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.18
>         Environment: Java version: 1.8.0_192, vendor: Oracle Corporation,
> runtime: /Library/Java/JavaVirtualMachines/jdk1.8.0_192.jdk/Contents/Home/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "mac os x", version: "10.14", arch: "x86_64", family: "mac"
>
> bsdtar 2.8.3 - libarchive 2.8.3
>            Reporter: Kirk Hardy
>            Priority: Major
>         Attachments: compress-bug-POC.tar.gz

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
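
For readers trying to interpret the error message, here is a minimal sketch of how a strict octal parser would reject the reported field. It is not the commons-compress implementation; the class `OctalFieldDemo` and both methods are hypothetical names introduced for illustration. It assumes only what the message states: a 12-byte numeric header field containing `00000000NaN` followed by a NUL. The lenient variant is one possible tolerant strategy and is not claimed to be what BSDTar or JTar actually do.

```java
import java.nio.charset.StandardCharsets;

public class OctalFieldDemo {

    /** Strict parse (hypothetical): every byte between the padding must be an octal digit. */
    static long parseOctalStrict(byte[] buf, int offset, int length) {
        int start = offset;
        int end = offset + length;
        // Skip leading space padding.
        while (start < end && buf[start] == ' ') {
            start++;
        }
        // Trim trailing NUL or space terminators.
        while (end > start && (buf[end - 1] == 0 || buf[end - 1] == ' ')) {
            end--;
        }
        long result = 0;
        for (int i = start; i < end; i++) {
            byte b = buf[i];
            if (b < '0' || b > '7') {
                // Report the raw byte value and its position within the field.
                throw new IllegalArgumentException(
                        "Invalid byte " + b + " at offset " + (i - offset));
            }
            result = (result << 3) + (b - '0');
        }
        return result;
    }

    /** Lenient parse (hypothetical): stop at the first byte that is not an octal digit. */
    static long parseOctalLenient(byte[] buf, int offset, int length) {
        int i = offset;
        int end = offset + length;
        while (i < end && buf[i] == ' ') {
            i++;
        }
        long result = 0;
        while (i < end && buf[i] >= '0' && buf[i] <= '7') {
            result = (result << 3) + (buf[i] - '0');
            i++;
        }
        return result;
    }

    public static void main(String[] args) {
        // A well-formed 12-byte field: octal 644 followed by a NUL terminator.
        byte[] good = "00000000644\0".getBytes(StandardCharsets.US_ASCII);
        // The field from the report: 'N' is ASCII 78 and sits at index 8.
        byte[] bad = "00000000NaN\0".getBytes(StandardCharsets.US_ASCII);

        System.out.println(parseOctalStrict(good, 0, 12));   // prints 420
        try {
            parseOctalStrict(bad, 0, 12);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());              // prints Invalid byte 78 at offset 8
        }
        System.out.println(parseOctalLenient(bad, 0, 12));   // prints 0
    }
}
```

The strict variant reproduces the "byte 78 at offset 8" detail from the stack trace, since `N` is ASCII 78 and follows eight `0` characters. Whether a reader should fail or fall back to a default for such a field is a design choice; the lenient variant shows only how a fallback to 0 could arise.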