[ 
https://issues.apache.org/jira/browse/TIKA-533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922038#action_12922038
 ] 

Jukka Zitting commented on TIKA-533:
------------------------------------

The magic byte pattern we added in TIKA-402 for detecting iWork documents seems 
to be too eager, as it also matches this document. Note that the test file 
itself being part of a zip archive makes no difference; it is detected as 
application/vnd.apple.iwork even as a standalone document.

I removed the application/vnd.apple.iwork magic byte pattern in revision 
1023712, which should solve your problem. It looks like we should instead use a 
container-aware detector for iWork as well, as I also had to disable a few 
iWork test cases that no longer detect the format correctly.
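
To illustrate the idea, here is a rough sketch of what a container-aware check 
could look like. It is not the Tika implementation: the class and method names 
are hypothetical, it uses only java.util.zip, and the marker entry names 
(index.apxl, index.xml) are an assumption about how iWork packages are laid 
out. The point is only that the decision is based on the entries inside the 
zip container rather than on the leading bytes of the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch; not part of Tika. */
public class ContainerAwareSketch {

    // Assumption: an iWork package carries one of these entries at its root.
    static final Set<String> IWORK_MARKERS =
            new HashSet<>(Arrays.asList("index.apxl", "index.xml"));

    /** Returns a media type based on the zip entry names, not on magic bytes. */
    static String detect(InputStream in) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(in)) {
            boolean sawEntry = false;
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                sawEntry = true;
                if (IWORK_MARKERS.contains(entry.getName())) {
                    return "application/vnd.apple.iwork";
                }
            }
            // No marker entry found: a plain zip, or not a readable zip at all.
            return sawEntry ? "application/zip" : "application/octet-stream";
        }
    }

    /** Helper for the demo: builds an in-memory zip holding a single entry. */
    static byte[] zipOf(String name, byte[] content) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(name));
            zip.write(content);
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Same shape as the attachment: a zip containing only another zip.
        byte[] inner = zipOf("a.txt", new byte[] {'h', 'i'});
        byte[] outer = zipOf("inner.zip", inner);
        System.out.println(detect(new ByteArrayInputStream(outer)));
    }
}
```

With a check along these lines, a zip-within-zip like the attached file would 
fall through to application/zip, since it has no marker entry. A real detector 
would probably also need to look inside the index file, as a name like 
index.xml is generic enough to appear in unrelated archives.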

> Mis-detection of zip-within-zip as application/vnd.apple.iwork, with no 
> output by CLI app
> -----------------------------------------------------------------------------------------
>
>                 Key: TIKA-533
>                 URL: https://issues.apache.org/jira/browse/TIKA-533
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.8
>         Environment: Windows 7 64-bit, latest Tika build as of 18th Oct 2010.
>            Reporter: Geoff Jarrad
>         Attachments: zip-within-zip.zip
>
>
> It appears that, at least in some circumstances, a zip file containing only 
> another zip file is being mis-detected as application/vnd.apple.iwork.
> In addition, for such files, the command-line parser does not return any 
> output at all.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
