[ 
https://issues.apache.org/jira/browse/TIKA-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342347#comment-14342347
 ] 

Nick Burch commented on TIKA-289:
---------------------------------

As of r1663136, you can run the Tika CLI with the option 
{{--compare-file-magic=<dir>}} to compare the Tika mime types against a 
File(1) magic directory. This reports the mime types known to File(1) but 
not to Tika, the types for which File(1) has magic but Tika doesn't, and some 
summary statistics.
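
To make the shape of that report concrete, here is a rough sketch of the kind of comparison involved. It is not the code committed to Tika. The function name {{compare_magic}}, the dict-based inputs and the summary keys are all assumptions made for illustration, and it assumes the "has magic but Tika doesn't" list only covers types that both tools already know about.

```python
# Hypothetical sketch only: this is NOT Tika's implementation of
# --compare-file-magic. Each input maps a mime type name to True when
# that tool has a magic pattern for the type, and False otherwise.

def compare_magic(file_types, tika_types):
    """Return (missing_types, missing_magic, summary) for two type maps."""
    # Types File(1) knows about that Tika does not define at all.
    missing_types = sorted(set(file_types) - set(tika_types))

    # Types both tools know, where File(1) has magic but Tika has none.
    # (Assumption: types missing from Tika entirely are not repeated here.)
    missing_magic = sorted(
        name
        for name, has_magic in file_types.items()
        if has_magic and name in tika_types and not tika_types[name]
    )

    # Simple counts standing in for the summary statistics.
    summary = {
        "file_types": len(file_types),
        "tika_types": len(tika_types),
        "missing_types": len(missing_types),
        "missing_magic": len(missing_magic),
    }
    return missing_types, missing_magic, summary
```

The two lists correspond roughly to the two attachments on this issue: one for types Tika lacks entirely, and one for known types where only the magic is missing.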

Hopefully others can use that soon-ish to add some of the missing types, and 
the missing magic for known types. Longer term, we can use it to track when 
File(1) adds new types that we might want to add too.

> Add magic byte patterns from file(1)
> ------------------------------------
>
>                 Key: TIKA-289
>                 URL: https://issues.apache.org/jira/browse/TIKA-289
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>            Reporter: Jukka Zitting
>            Priority: Minor
>         Attachments: file-has-magic-tika-missing.txt, file-mimes-missing.txt
>
>
> As discussed in TIKA-285, the file(1) command comes with a pretty 
> comprehensive set of magic byte patterns. It would be nice to get those 
> patterns included also in Tika.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
