[ 
https://issues.apache.org/jira/browse/TIKA-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Burch resolved TIKA-1502.
------------------------------
       Resolution: Fixed
    Fix Version/s: 1.7

In r1647489 I've re-ordered the MediaTypeRegistry logic for getting the super 
type, so that if an explicit inheritance has been defined between one 
parameterised type and another, that inheritance is used in preference to 
"drop all parameters".

That means that the supertype lookup for a type defined in the mime types 
file can now proceed one step at a time:

application/x-berkeley-db;format=hash;version=2
to
application/x-berkeley-db;format=hash
to
application/x-berkeley-db

However, for parameters unknown to the mime types file, the behaviour remains 
as before: all parameters are dropped in one go, e.g.

text/plain; charset=UTF-8
to
text/plain
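
As a rough illustration of that lookup order, here is a standalone sketch. It 
is not the code committed in r1647489 and does not use the Tika classes: the 
class SupertypeSketch and its methods are made up for this example, types are 
plain strings matched exactly, and returning null when there is no further 
supertype is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a media type registry; not Tika's MediaTypeRegistry.
class SupertypeSketch {
    // Explicit child -> parent links, as a mime types file might declare them.
    private final Map<String, String> inheritance = new HashMap<>();

    void addSuperType(String type, String superType) {
        inheritance.put(type, superType);
    }

    // Returns the supertype of the given type, or null if the sketch knows none.
    String getSupertype(String type) {
        // 1. An explicitly declared inheritance wins, even between two
        //    parameterised types.
        String explicit = inheritance.get(type);
        if (explicit != null) {
            return explicit;
        }
        // 2. Otherwise fall back to dropping all parameters at once.
        int semi = type.indexOf(';');
        if (semi >= 0) {
            return type.substring(0, semi).trim();
        }
        // 3. No parameters and no declared parent (assumption: signal with null).
        return null;
    }
}
```

With the two Berkeley DB links registered, looking up 
application/x-berkeley-db;format=hash;version=2 yields 
application/x-berkeley-db;format=hash rather than jumping straight to 
application/x-berkeley-db, while text/plain; charset=UTF-8 still goes 
directly to text/plain.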

> Mime magic for database file formats
> ------------------------------------
>
>                 Key: TIKA-1502
>                 URL: https://issues.apache.org/jira/browse/TIKA-1502
>             Project: Tika
>          Issue Type: Improvement
>          Components: mime
>    Affects Versions: 1.6
>            Reporter: Nick Burch
>             Fix For: 1.7
>
>
> I noticed today that Tika can't detect a lot of common database formats, such 
> as sqlite or Berkeley DB or MISAM.
> The unix file utility got most of those, which makes me think that there's a 
> sensible-ish header on most that we can write some mime magic for.
> It'd therefore be good to add mime entries, with magic where possible, for 
> many of these common database file formats.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
