Sebastian Nagel created TIKA-2675:
-------------------------------------

             Summary: OpenDocumentParser should fail on invalid zip files
                 Key: TIKA-2675
                 URL: https://issues.apache.org/jira/browse/TIKA-2675
             Project: Tika
          Issue Type: Bug
          Components: parser
            Reporter: Sebastian Nagel


The OpenDocumentParser assumes that its input is a zip container. However, if it is 
called on an invalid zip stream fetched from a remote URL (see NUTCH-2603), the parser 
signals success and returns a document with empty content instead of failing. The 
behavior differs when the parser is called on a local file: while the 
[constructor of ZipFile|https://docs.oracle.com/javase/8/docs/api/java/util/zip/ZipFile.html#ZipFile-java.io.File-] 
fails on invalid input, the 
[constructor of ZipInputStream|https://docs.oracle.com/javase/8/docs/api/java/util/zip/ZipInputStream.html#ZipInputStream-java.io.InputStream-] 
does not validate its input, so an invalid stream is silently ignored.
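
The difference can be reproduced outside of Tika with a minimal sketch that uses only java.util.zip. The class name InvalidZipDemo, its helper methods and the sample bytes are made up for illustration and are not part of Tika or Nutch; the sketch only assumes that the input is not a zip archive.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.zip.ZipException;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical demo class, not part of Tika.
class InvalidZipDemo {

    // Arbitrary bytes that do not form a zip archive.
    static final byte[] NOT_A_ZIP =
            "this is not a zip archive".getBytes(StandardCharsets.US_ASCII);

    // Returns true if opening the invalid data as a local ZipFile fails.
    static boolean zipFileRejects() {
        try {
            File tmp = File.createTempFile("invalid", ".odt");
            try {
                Files.write(tmp.toPath(), NOT_A_ZIP);
                try (ZipFile zf = new ZipFile(tmp)) {
                    return false; // the constructor accepted the file
                } catch (ZipException e) {
                    return true;  // the constructor rejected the file
                }
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Counts the entries that ZipInputStream reports for the same invalid data.
    static int countStreamEntries() {
        int entries = 0;
        try (ZipInputStream zis =
                     new ZipInputStream(new ByteArrayInputStream(NOT_A_ZIP))) {
            // The constructor does not inspect the data; getNextEntry()
            // returns null when no local file header is found.
            while (zis.getNextEntry() != null) {
                entries++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    public static void main(String[] args) {
        System.out.println("ZipFile rejects invalid input: " + zipFileRejects());
        System.out.println("ZipInputStream entries found: " + countStreamEntries());
    }
}
```

A stream-based parser that treats "no entries" as an empty but valid document would therefore report success on such input. Checking that at least one entry was read is one possible way to detect the problem; it is not necessarily the fix chosen for this issue.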





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
