[ https://issues.apache.org/jira/browse/TIKA-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-3976.
-------------------------------
Fix Version/s: 2.7.1
Resolution: Fixed
> Allow users to configure behavior for zero-byte files
> -----------------------------------------------------
>
> Key: TIKA-3976
> URL: https://issues.apache.org/jira/browse/TIKA-3976
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Minor
> Fix For: 2.7.1
>
>
> We currently throw a ZeroByteFileException whenever the stream is empty in
> AutoDetectParser.
> I _think_ we did this for search-system use cases, where sending in a
> zero-byte file would be exceptional.
> For other use cases, though, especially embedded files, it is fairly normal
> to have zero-byte content alongside meaningful metadata.
> Embedded files in general are one such case (as in .ppt, etc.). WARC
> redirects and HTTPResponse files are other container types whose embedded
> file may carry meaningful metadata even though its stream is zero bytes.
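
The behavior described above can be illustrated with a small, self-contained
sketch. This is not Tika's code: the class ZeroByteSketch, the flag name
throwOnZeroBytes, the local ZeroByteFileException stand-in, and the metadata
keys are all hypothetical and chosen only for illustration. It shows one way
a parser could either reject an empty stream or keep the caller's metadata
and return normally, depending on a caller-supplied setting.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of this is Tika's actual API.
class ZeroByteSketch {

    // Local stand-in named after the exception mentioned in the issue.
    static class ZeroByteFileException extends IOException {
        ZeroByteFileException(String message) {
            super(message);
        }
    }

    // Probes one byte; returns true if the stream is empty,
    // otherwise pushes the byte back so nothing is lost.
    static boolean isEmpty(PushbackInputStream in) throws IOException {
        int b = in.read();
        if (b == -1) {
            return true;
        }
        in.unread(b);
        return false;
    }

    // "Parses" a stream by counting its bytes. The flag throwOnZeroBytes
    // (assumed name) decides whether an empty stream is an error or is
    // accepted with the supplied metadata preserved.
    static Map<String, String> parse(InputStream raw,
                                     Map<String, String> metadata,
                                     boolean throwOnZeroBytes) throws IOException {
        PushbackInputStream in = new PushbackInputStream(raw, 1);
        Map<String, String> result = new LinkedHashMap<>(metadata);
        if (isEmpty(in)) {
            if (throwOnZeroBytes) {
                throw new ZeroByteFileException("zero-byte stream");
            }
            // Keep the metadata, record that there was no content.
            result.put("content-length", "0");
            return result;
        }
        long count = 0;
        while (in.read() != -1) {
            count++;
        }
        result.put("content-length", Long.toString(count));
        return result;
    }
}
```

With the flag set to true, an empty stream is rejected, which matches the
search-system expectation in the description. With it set to false, an
embedded file such as a WARC redirect record would still yield its metadata.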
--
This message was sent by Atlassian Jira
(v8.20.10#820010)