Tim Allison created TIKA-3701:
---------------------------------

             Summary: ZipDetector on a file should back off to streaming 
detection on failure to open a zipfile
                 Key: TIKA-3701
                 URL: https://issues.apache.org/jira/browse/TIKA-3701
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison


If a file is passed to Tika wrapped as a TikaInputStream with an underlying 
file, the DefaultZipDetector tries to open a ZipFile.  If the file is truncated 
or the ZipFile otherwise fails to open, the DefaultZipDetector effectively 
gives up.

Given that there's still a file available, we should try to do a streaming 
detect by reopening the file as a regular InputStream.
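
A minimal sketch of the proposed fallback, using only java.util.zip from the 
JDK. This is not the DefaultZipDetector code: the class name ZipFallbackSketch 
and the method listEntryNames are hypothetical, and the real detector may use 
different zip classes and would map the entry names to a media type rather 
than return them.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical illustration; not part of Tika.
public class ZipFallbackSketch {

    /**
     * Lists entry names, preferring random access via ZipFile and backing
     * off to a streaming read of the same file if ZipFile cannot open it.
     */
    static List<String> listEntryNames(Path path) {
        List<String> names = new ArrayList<>();

        // First attempt: random access. This needs an intact central
        // directory at the end of the file, so truncated files fail here.
        try (ZipFile zip = new ZipFile(path.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
            return names;
        } catch (IOException e) {
            // ZipException is an IOException; discard partial results
            // and fall through to streaming.
            names.clear();
        }

        // Fallback: reopen the file as a plain InputStream and read the
        // local file headers in order from the start of the file.
        try (InputStream is = new BufferedInputStream(Files.newInputStream(path));
             ZipInputStream zis = new ZipInputStream(is)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            // The stream ended mid-entry; keep whatever names were read.
        }
        return names;
    }
}
```

With a fallback like this, a truncated file on disk and the same bytes sent as 
a stream go through the same streaming path, so a detector built on the entry 
names would see the same input in both cases.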

If we don't do this, we get different detection results for some truncated 
OOXML files depending on whether the user sends in a file or a stream.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
