[
https://issues.apache.org/jira/browse/TIKA-346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jukka Zitting resolved TIKA-346.
--------------------------------
Resolution: Fixed
Fix Version/s: 1.0
Assignee: Jukka Zitting
Done in revision 1125861.
> ZipParser throws "invalid compression method" error for some archives
> ---------------------------------------------------------------------
>
> Key: TIKA-346
> URL: https://issues.apache.org/jira/browse/TIKA-346
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 0.5
> Environment: Windows XP, JVM 1.6.16
> Reporter: Robert Trickey
> Assignee: Jukka Zitting
> Fix For: 1.0
>
> Attachments: TIKA-346.patch, moby.zip
>
>
> This could be a bug in the underlying apache-commons code. When trying to
> parse the attached file to extract text content, an error is thrown with the
> following stacktrace:
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from
> org.apache.tika.parser.pkg.ZipParser@1b963c4
> at
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:122)
> at
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:101)
> at my.code.wherever.....
> Caused by: java.lang.IllegalArgumentException: invalid compression method
> at java.util.zip.ZipEntry.setMethod(ZipEntry.java:209)
> at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:146)
> at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
> at
> org.apache.tika.parser.pkg.PackageParser.parseArchive(PackageParser.java:66)
> at org.apache.tika.parser.pkg.ZipParser.parse(ZipParser.java:49)
> at
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:120)
> ... 25 more
> I have extracted the content of the zip and ran the autodetect parser against
> all content files without problems, so it is definitely the zip that is the
> problem.
> The attached zip is from Project Gutenberg and hence public domain.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira