[
https://issues.apache.org/jira/browse/TIKA-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147331#comment-16147331
]
Matthew Caruana Galizia commented on TIKA-2450:
-----------------------------------------------
OK, with that in mind, I agree with you.
In the same way that the various encrypted document exceptions are
normalised to a {{tika.exception.EncryptedDocumentException}}, I think it
would be useful, as Tim suggests, to normalise empty file exceptions to a
{{tika.exception.ZeroByteFileException}} (that extends {{TikaException}}).
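A rough sketch of what I mean, to make the idea concrete. Everything here is an assumption for illustration: the {{TikaException}} class is a local stand-in so that the snippet compiles on its own, and {{requireNonEmpty}} and {{handle}} are hypothetical helper names, not existing Tika methods. The point is only that the empty-input case surfaces as one dedicated exception type that callers can catch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

final class ZeroByteSketch {

    // Local stand-in for Tika's base exception, declared here only so the
    // sketch is self-contained. It is NOT the real Tika class.
    static class TikaException extends Exception {
        TikaException(String msg) { super(msg); }
    }

    // Hypothetical normalised exception for empty input, as proposed above.
    static class ZeroByteFileException extends TikaException {
        ZeroByteFileException(String msg) { super(msg); }
    }

    // Hypothetical helper: peek at the first byte and fail early if the
    // stream is empty, otherwise push the byte back and return the stream.
    static InputStream requireNonEmpty(InputStream in)
            throws IOException, ZeroByteFileException {
        PushbackInputStream pushback = new PushbackInputStream(in, 1);
        int first = pushback.read();
        if (first == -1) {
            throw new ZeroByteFileException("InputStream must have > 0 bytes");
        }
        pushback.unread(first);
        return pushback;
    }

    // Caller side: the empty-file case gets its own branch instead of
    // being buried as the cause of a generic parse failure.
    static String handle(byte[] data) {
        try (InputStream in = requireNonEmpty(new ByteArrayInputStream(data))) {
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            return "parsed " + count + " bytes";
        } catch (ZeroByteFileException e) {
            return "empty";
        } catch (IOException e) {
            return "error";
        }
    }
}
```

With something like this, a caller can run whatever special logic it wants for empty files by catching the one exception type, without having to inspect the cause chain for a POI-specific class.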
> OfficeParser.parse called for zero-byte file with .doc extension
> ----------------------------------------------------------------
>
> Key: TIKA-2450
> URL: https://issues.apache.org/jira/browse/TIKA-2450
> Project: Tika
> Issue Type: Bug
> Components: detector, parser
> Affects Versions: 1.16
> Reporter: Matthew Caruana Galizia
> Priority: Minor
>
> A zero-byte (empty) file with a .doc extension is detected as a Word Document
> and the {{OfficeParser.parse}} method is called for this file.
> We then get a {{TikaException}}, with the cause given as an
> {{org.apache.poi.EmptyFileException}}.
> I think it would be more useful if the file were NOT detected as a Word
> Document, meaning that the {{AutoDetectParser}} would then fall back to
> whatever is set as the fallback parser in the parse context.
> This is more useful because the user can then trigger some special logic for
> handling empty files.