[ https://issues.apache.org/jira/browse/TIKA-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17202278#comment-17202278 ]
Hudson commented on TIKA-3196:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-branch1x-jdk8 #17 (See [https://ci-builds.apache.org/job/Tika/job/tika-branch1x-jdk8/17/])
Fix TIKA-3196 (#364) (tallison: [https://github.com/apache/tika/commit/9736af8d8df86cd974eaaa7b27e566af83cfb6c4])
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/pkg/ZipParserTest.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/pkg/PackageParser.java
* (add) tika-parsers/src/test/resources/test-documents/testZip_with_DataDescriptor2.zip
* (add) tika-parsers/src/test/resources/test-documents/testZip_with_DataDescriptor.zip

> PackageParser should attempt to parse entries from zip files with STORED entries with data descriptor
> -----------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-3196
>                 URL: https://issues.apache.org/jira/browse/TIKA-3196
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Trevor Bentley
>            Priority: Major
>         Attachments: OOO-107047-0.oxt-145.zip
>
>
> We are currently using tika for text extraction. Currently some sites are
> returning zips that have entries with stored data descriptors which fail to
> extract due to the ZipArchiveInputStream (in commons-compress) defaulting to
> false for 'allowStoredEntriesWithDataDescriptor'.
> Since ZipArchiveInputStream has support for reading zips with data
> descriptors we should attempt to read the zip with that feature enabled when
> we get a data descriptor UnsupportedZipFeatureException.
> Pull Request:
> [https://github.com/apache/tika/pull/356|https://github.com/apache/tika/pull/355]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
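
For readers unfamiliar with the term, a "STORED entry with data descriptor" is an entry whose local file header declares compression method 0 (STORED) while also setting bit 3 of the general purpose flag, meaning the sizes and CRC follow the data instead of preceding it. The following is only a minimal sketch of that header check based on the ZIP local file header layout. It is not the code from the Tika fix or from commons-compress, and the class and method names (StoredDataDescriptorCheck, isStoredWithDataDescriptor, sampleHeader) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only; not part of Tika or commons-compress.
final class StoredDataDescriptorCheck {
    // Local file header signature "PK\3\4", read as a little-endian int.
    static final int LOCAL_FILE_HEADER_SIG = 0x04034b50;
    // Bit 3 of the general purpose flag: sizes and CRC are in a trailing data descriptor.
    static final int FLAG_DATA_DESCRIPTOR = 0x0008;
    // Compression method 0: data is stored without compression.
    static final int METHOD_STORED = 0;

    /**
     * Returns true if the given bytes begin with a local file header that
     * describes a STORED entry with the data descriptor flag set.
     */
    static boolean isStoredWithDataDescriptor(byte[] header) {
        if (header.length < 10) {
            return false; // too short to contain signature, version, flags and method
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt(0) != LOCAL_FILE_HEADER_SIG) {
            return false;
        }
        int flags = buf.getShort(6) & 0xFFFF;  // general purpose bit flag
        int method = buf.getShort(8) & 0xFFFF; // compression method
        return method == METHOD_STORED && (flags & FLAG_DATA_DESCRIPTOR) != 0;
    }

    /** Builds a 30-byte local file header with the given flags and method, all other fields zero. */
    static byte[] sampleHeader(int flags, int method) {
        ByteBuffer buf = ByteBuffer.allocate(30).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(LOCAL_FILE_HEADER_SIG);
        buf.putShort((short) 20);     // version needed to extract
        buf.putShort((short) flags);  // general purpose bit flag
        buf.putShort((short) method); // compression method
        return buf.array();
    }
}
```

Because a STORED entry has no compression stream to signal where its data ends, a streaming reader cannot know the entry length up front when the sizes are deferred to the data descriptor. That is presumably why the 'allowStoredEntriesWithDataDescriptor' option mentioned in the issue is off by default and has to be enabled explicitly.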