[ https://issues.apache.org/jira/browse/TIKA-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933337#comment-13933337 ]

Nick Burch commented on TIKA-1259:
----------------------------------
I don't think we want to use the freedesktop mime ones, firstly as they don't
support hierarchies, secondly as I'm not sure the license is compatible, and
thirdly as they seem to be missing some.

As of r1577180, I've added details of all the Ogg based audio and video formats
I can find info on. I still need to add a few of the metadata based formats.

Also, based on research, I've changed the type returned for FLAC-in-Ogg, which
means I've had to disable one part of a unit test. We'll need to re-enable that
when we upgrade the Vorbis Java library to match.

However, as these are container formats, just looking for magic strings isn't
very reliable, as that only tells you the type of the first packet, which may
not be what you want for the overall file. The only way to detect them reliably
is to open the container and parse it, which means the Detector provided by the
Vorbis Java library needs some updates to go with these new types.
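
To illustrate the difference between matching one magic string and parsing the
container, here is a rough sketch that walks the leading beginning-of-stream
pages of an Ogg file and reports every logical stream it finds, not just the
first. It is not the Vorbis Java library's Detector or any Tika code: the class
name OggStreamSniffer, its methods, and the returned labels are hypothetical.
It also assumes each stream's identification packet fits within its first page
and only knows a handful of common first-packet prefixes.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
public class OggStreamSniffer {
    // Common first-packet prefixes. The labels are plain codec names,
    // not the mime types registered in Tika.
    private static final String[][] SIGNATURES = {
        {"\u0001vorbis", "Vorbis"},
        {"OpusHead", "Opus"},
        {"\u0080theora", "Theora"},
        {"\u007fFLAC", "FLAC"},
        {"Speex   ", "Speex"},
    };

    // Reads pages while they carry the beginning-of-stream flag and
    // returns one label per logical stream, in file order.
    public static List<String> sniff(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<String> codecs = new ArrayList<>();
        byte[] header = new byte[27]; // fixed part of an Ogg page header
        while (true) {
            try {
                data.readFully(header);
            } catch (EOFException e) {
                break; // no more pages
            }
            boolean captured = header[0] == 'O' && header[1] == 'g'
                    && header[2] == 'g' && header[3] == 'S';
            boolean beginningOfStream = (header[5] & 0x02) != 0;
            if (!captured || !beginningOfStream) {
                break; // not a page, or the stream headers are finished
            }
            // The segment table gives the size of the page body.
            byte[] table = new byte[header[26] & 0xFF];
            data.readFully(table);
            int bodyLength = 0;
            for (byte lacing : table) {
                bodyLength += lacing & 0xFF;
            }
            byte[] body = new byte[bodyLength];
            data.readFully(body);
            codecs.add(identify(body));
        }
        return codecs;
    }

    // Matches the start of a packet against the known prefixes.
    static String identify(byte[] packet) {
        int length = Math.min(packet.length, 8);
        String start = new String(packet, 0, length, StandardCharsets.ISO_8859_1);
        for (String[] signature : SIGNATURES) {
            if (start.startsWith(signature[0])) {
                return signature[1];
            }
        }
        return "unknown";
    }
}
```

A file with both a video and an audio stream would yield two entries, and a
detector could then pick the overall type from the full list rather than from
whichever packet happens to come first.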
> More ogg based mime entries
> ---------------------------
>
> Key: TIKA-1259
> URL: https://issues.apache.org/jira/browse/TIKA-1259
> Project: Tika
> Issue Type: Improvement
> Components: mime
> Affects Versions: 1.5
> Reporter: Nick Burch
> Fix For: 1.6
>
>
> There are a few more Ogg based formats for which we don't currently have mime
> entries or magic. Wikipedia is pretty good at giving the mime types and
> descriptions, while /liboggz/oggz_auto.h has the first packet identifier
> strings.