[ 
https://issues.apache.org/jira/browse/TIKA-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17202246#comment-17202246
 ] 

Hudson commented on TIKA-3196:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #27 (See 
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk8/27/])
Fix TIKA-3196 (#364) (github: 
[https://github.com/apache/tika/commit/aba3e433510f02300ff627df74d09fdfb372cf38])
* (edit) 
tika-parser-modules/tika-parser-pkg-module/src/main/java/org/apache/tika/parser/pkg/PackageParser.java
* (add) 
tika-parser-modules/tika-parser-pkg-module/src/test/resources/test-documents/testZip_with_DataDescriptor.zip
* (edit) 
tika-parser-modules/tika-parser-pkg-module/src/test/java/org/apache/tika/parser/pkg/ZipParserTest.java


> PackageParser should attempt to parse entries from zip files with STORED 
> entries with data descriptor
> -----------------------------------------------------------------------------------------------------
>
>                 Key: TIKA-3196
>                 URL: https://issues.apache.org/jira/browse/TIKA-3196
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Trevor Bentley
>            Priority: Major
>         Attachments: OOO-107047-0.oxt-145.zip
>
>
> We use Tika for text extraction. Some sites return zip files that contain 
> STORED entries with data descriptors. These fail to extract because 
> ZipArchiveInputStream (in commons-compress) defaults 
> 'allowStoredEntriesWithDataDescriptor' to false.
> Since ZipArchiveInputStream can read such entries when that option is 
> enabled, we should retry with it enabled when we get an 
> UnsupportedZipFeatureException for the data descriptor feature.
> Pull Request: 
> [https://github.com/apache/tika/pull/356|https://github.com/apache/tika/pull/355]
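
For readers unfamiliar with the condition described above, here is a minimal sketch of how a "STORED entry with data descriptor" can be recognised from the first bytes of a zip local file header. It assumes the standard ZIP layout (4-byte signature, 2-byte version, 2-byte general purpose flags, 2-byte compression method, all little-endian). The class and method names are hypothetical and are not part of Tika or commons-compress; this is an illustration, not the code from the fix.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper for illustration; not part of Tika or commons-compress. */
final class LocalHeaderCheck {
    // Local file header signature "PK\003\004", read as a little-endian int.
    private static final int LOCAL_HEADER_SIGNATURE = 0x04034b50;
    // Bit 3 of the general purpose flags: CRC and sizes follow the entry data.
    private static final int DATA_DESCRIPTOR_FLAG = 0x0008;
    // Compression method 0 means the entry is STORED (uncompressed).
    private static final int METHOD_STORED = 0;

    private LocalHeaderCheck() {
    }

    /**
     * Returns true if the given bytes start with a local file header whose
     * entry is STORED and uses a trailing data descriptor.
     */
    static boolean isStoredWithDataDescriptor(byte[] header) {
        // Signature (4) + version (2) + flags (2) + method (2) = 10 bytes.
        if (header == null || header.length < 10) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt(0) != LOCAL_HEADER_SIGNATURE) {
            return false;
        }
        int flags = buf.getShort(6) & 0xFFFF;
        int method = buf.getShort(8) & 0xFFFF;
        return method == METHOD_STORED && (flags & DATA_DESCRIPTOR_FLAG) != 0;
    }
}
```

Such entries are awkward for a streaming reader because the size of the uncompressed data is not known until the descriptor after it has been read. In commons-compress the opt-in is the fourth boolean argument of the ZipArchiveInputStream constructor.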



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
