[ https://issues.apache.org/jira/browse/TIKA-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147699#comment-16147699 ]

Hudson commented on TIKA-2450:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1352 (See [https://builds.apache.org/job/Tika-trunk/1352/])
TIKA-2450 -- AutoDetectParser should throw a ZeroByteFileException for 
(tallison: [https://github.com/apache/tika/commit/a1533977852307c5095efaebfcc5a896d914a57c])
* (add) tika-core/src/main/java/org/apache/tika/exception/ZeroByteFileException.java
* (edit) tika-core/src/main/java/org/apache/tika/parser/AutoDetectParser.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/AutoDetectParserTest.java
* (edit) CHANGES.txt


> OfficeParser.parse called for zero-byte file with .doc extension
> ----------------------------------------------------------------
>
>                 Key: TIKA-2450
>                 URL: https://issues.apache.org/jira/browse/TIKA-2450
>             Project: Tika
>          Issue Type: Bug
>          Components: detector, parser
>    Affects Versions: 1.16
>            Reporter: Matthew Caruana Galizia
>            Priority: Minor
>             Fix For: 1.17
>
>
> A zero-byte (empty) file with a .doc extension is detected as a Word Document 
> and the {{OfficeParser.parse}} method is called for this file.
> We then get a {{TikaException}}, with the cause given as an 
> {{org.apache.poi.EmptyFileException}}.
> I think it would be more useful if the file were NOT detected as a Word 
> Document, meaning that the {{AutoDetectParser}} would then fall back to 
> whatever is set as the fallback parser in the parse context.
> This is more useful because the user can then trigger some special logic for 
> handling empty files.
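
For illustration only, the "special logic for handling empty files" that the report asks for can be sketched in plain Java without any Tika dependency. This is a minimal sketch and not the code from the commit above: EmptyInputException, EmptyInputGuard, requireNonEmpty and describe are hypothetical names, and the exception is only a stand-in for the ZeroByteFileException named in the commit message. The idea shown is to peek at the first byte of the stream before any format-specific parser runs, and to signal an empty input with a dedicated exception type that the caller can catch separately.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a dedicated "empty input" exception.
// It is NOT Tika's ZeroByteFileException.
class EmptyInputException extends IOException {
    EmptyInputException(String message) {
        super(message);
    }
}

// Hypothetical helper; the names are assumptions made for this sketch.
class EmptyInputGuard {

    // Peeks at the first byte. Throws EmptyInputException if the stream
    // has no bytes; otherwise pushes the byte back and returns a stream
    // that still starts at the first byte.
    static InputStream requireNonEmpty(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 1);
        int first = pushback.read();
        if (first == -1) {
            throw new EmptyInputException("input contains zero bytes");
        }
        pushback.unread(first);
        return pushback;
    }

    // Caller-side handling: empty input gets its own branch instead of
    // surfacing as a generic parse failure.
    static String describe(byte[] data) {
        try {
            InputStream in = requireNonEmpty(new ByteArrayInputStream(data));
            return "non-empty, first byte = " + in.read();
        } catch (EmptyInputException e) {
            return "empty";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this sketch, EmptyInputGuard.describe(new byte[0]) takes the "empty" branch, while any non-empty array reaches the normal path with its first byte still unread by the guard.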



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
