Sebastian Nagel created TIKA-2675:
-------------------------------------
Summary: OpenDocumentParser should fail on invalid zip files
Key: TIKA-2675
URL: https://issues.apache.org/jira/browse/TIKA-2675
Project: Tika
Issue Type: Bug
Components: parser
Reporter: Sebastian Nagel
The OpenDocumentParser assumes that its input is a zip container. However, if it is
called on an invalid zip stream from a remote URL (see NUTCH-2603), the parser
signals success and returns a document with empty content instead of failing. The
behavior differs when the parser is called on a local file: the [constructor of
ZipFile|https://docs.oracle.com/javase/8/docs/api/java/util/zip/ZipFile.html#ZipFile-java.io.File-]
throws a ZipException on invalid input, whereas the [constructor of
ZipInputStream|https://docs.oracle.com/javase/8/docs/api/java/util/zip/ZipInputStream.html#ZipInputStream-java.io.InputStream-]
performs no validation, and a subsequent getNextEntry() call returns null on
non-zip data without raising an error.
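
The difference can be reproduced outside of Tika with the JDK classes alone. The following is a minimal sketch, not code from OpenDocumentParser: the class name ZipValidationSketch and the helper methods opensAsZipFile and countStreamEntries are hypothetical names chosen for illustration, and the sample input is an arbitrary non-zip byte sequence.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.zip.ZipException;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical demo class, not part of Tika.
public class ZipValidationSketch {

    /** Returns true if ZipFile accepts the file, false if its constructor throws ZipException. */
    static boolean opensAsZipFile(File file) throws IOException {
        try (ZipFile zip = new ZipFile(file)) {
            return true;
        } catch (ZipException e) {
            // The constructor rejects input that is not a valid zip archive.
            return false;
        }
    }

    /** Counts the entries ZipInputStream reports; the constructor itself never inspects the data. */
    static int countStreamEntries(InputStream in) throws IOException {
        int count = 0;
        try (ZipInputStream zip = new ZipInputStream(in)) {
            // getNextEntry() returns null when no local file header is found.
            while (zip.getNextEntry() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Arbitrary non-zip content standing in for an invalid remote document.
        byte[] notAZip = "<html>not a zip archive</html>".getBytes(StandardCharsets.UTF_8);
        File tmp = File.createTempFile("invalid", ".odt");
        try {
            Files.write(tmp.toPath(), notAZip);
            System.out.println("ZipFile accepts file: " + opensAsZipFile(tmp));
            System.out.println("ZipInputStream entries: "
                    + countStreamEntries(new ByteArrayInputStream(notAZip)));
        } finally {
            tmp.delete();
        }
    }
}
```

With this input, the ZipFile path is expected to be rejected while the ZipInputStream path completes normally with zero entries. One possible way for a stream-based caller to detect the problem, offered here only as an assumption and not as the fix adopted in Tika, is to treat a zero entry count as a parse failure.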