[ https://issues.apache.org/jira/browse/TIKA-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182146#comment-15182146 ]
Nick Burch commented on TIKA-1882:
----------------------------------
Just because other people treat these bytes as a magic doesn't necessarily mean
they are one. Many others just blindly pick a few bytes that look common without
trying to understand the underlying format, and consequently get it wrong.
As the QuickTime container is the base for MP4, and our MP4 Video mime type
declares QuickTime Video as its parent, if the patterns are common to both then
QuickTime is the right place to put them.
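
To illustrate why a shared pattern belongs on the parent, here is a minimal
sketch of a parent lookup. The table and the function names are hypothetical
and are not Tika's registry API; only the video/mp4 -> video/quicktime link is
taken from the comment above.

```python
# Toy parent table (an assumption for illustration, not Tika's real data).
PARENTS = {"video/mp4": "video/quicktime"}


def ancestors(mime):
    """Yield the type itself, then each declared parent in turn."""
    while mime is not None:
        yield mime
        mime = PARENTS.get(mime)


def is_a(mime, candidate):
    """True if candidate is the type itself or one of its ancestors."""
    return candidate in ancestors(mime)
```

With this table, a file detected as video/quicktime by a shared pattern is still
a valid starting point for the more specific video/mp4, but not the other way
round.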
I've had a go in bee1a87d7d9ad3a1c5f45cf65082b9505dbe9fc0 to better express the
QuickTime/MP4 relationship in the mime types hierarchy. If you could merge that,
re-test, and switch hex strings to text where possible (see pull request
comments), then provided all tests pass I think we should be fine to apply it.
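
As a rough way to spot which hex strings can be written as text, the sketch
below converts a hex value to ASCII when every byte is printable. The helper
name is hypothetical, and it assumes the hex form is written with an optional
0x prefix.

```python
def hex_to_text(hex_string):
    """Return the ASCII text for a hex magic value, or None if any byte
    falls outside the printable ASCII range."""
    if hex_string.startswith("0x"):
        hex_string = hex_string[2:]
    raw = bytes.fromhex(hex_string)
    if all(0x20 <= b < 0x7F for b in raw):
        return raw.decode("ascii")
    return None
```

For instance, the bytes 66 72 65 65 are the text "free", so that pattern reads
better as text, while a value containing a zero byte has to stay in hex.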
> Updating the tika-mimetypes.xml for new mime magic patterns
> -----------------------------------------------------------
>
> Key: TIKA-1882
> URL: https://issues.apache.org/jira/browse/TIKA-1882
> Project: Tika
> Issue Type: Improvement
> Components: mime
> Affects Versions: 1.11
> Reporter: Manisha Kampasi
> Priority: Minor
> Labels: patch
>
> The following mime magic can be added to better detect the below mime-types:
> 1. application/vnd.ms-cab-compressed (.cab files) - pattern "MSCF" in the first 4 bytes
> 2. application/vnd.xara (.xar files) - pattern "xar!" in the first 4 bytes
> 3. application/x-mobipocket-ebook (.mobi files) - pattern "BOOKMOBI" starting
> at byte position 60
> 4. video/quicktime (.mov files) - patterns "free" and "wide" seen starting at
> byte position 4
> The changes can be seen here:
> https://github.com/mkampasi/tika/commit/f7433daf434a44937ba3ae8b15813a768f95e334
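
For illustration, the offset-based matching described in the issue can be
sketched as below, using only the Mobipocket and QuickTime rules from the list.
The rule table and the detect function are hypothetical names, not Tika's
implementation, which reads its rules from tika-mimetypes.xml.

```python
# (offset, pattern, mime type) rules taken from the issue description.
RULES = [
    (60, b"BOOKMOBI", "application/x-mobipocket-ebook"),
    (4, b"free", "video/quicktime"),
    (4, b"wide", "video/quicktime"),
]


def detect(data):
    """Return the mime type of the first rule whose pattern sits exactly
    at its offset in data, or None if no rule matches."""
    for offset, pattern, mime in RULES:
        if data[offset:offset + len(pattern)] == pattern:
            return mime
    return None
```

Note that the same four bytes at a different offset do not match, which is why
short patterns such as "free" are only meaningful together with their position.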