[ https://issues.apache.org/jira/browse/TIKA-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529135 ]

Jukka Zitting commented on TIKA-6:
----------------------------------
I don't think we can include the freedesktop.org.xml file in Tika. AFAIK the
database originates from
http://www.freedesktop.org/wiki/Software/shared-mime-info and is distributed
under the GPL. What we can do instead is to provide a configuration option
(with a reasonable default) for a user to point Tika to the mime database file
already available on a system. This way we don't need to include the viral
component within Tika releases.
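
A minimal sketch of what such a configuration option could look like, not an
actual Tika API. The property name "tika.mime.database" and the class name are
hypothetical, and the default path is only an assumption about where
shared-mime-info is commonly installed on Linux systems.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

final class MimeDatabaseLocator {
    // Hypothetical property name, chosen for illustration only.
    static final String PROPERTY = "tika.mime.database";

    // Assumed default: a common shared-mime-info install location on Linux.
    static final String DEFAULT_PATH =
            "/usr/share/mime/packages/freedesktop.org.xml";

    private MimeDatabaseLocator() {
    }

    // Returns the configured database path, or the default when the
    // property is missing or blank.
    static Path resolve(Properties config) {
        String configured = config.getProperty(PROPERTY);
        if (configured == null || configured.trim().isEmpty()) {
            return Paths.get(DEFAULT_PATH);
        }
        return Paths.get(configured.trim());
    }

    public static void main(String[] args) {
        // Allows overriding via -Dtika.mime.database=/some/path on the command line.
        Path database = resolve(System.getProperties());
        System.out.println("MIME database: " + database);
    }
}
```

With this approach the database is only read from the user's system at
runtime, so the file itself never has to ship inside a release.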

Some of the source files have the following license header. I guess it's an
oversight and easily resolved.

//Copyright (c) 2007, California Institute of Technology.
//ALL RIGHTS RESERVED. U.S. Government sponsorship acknowledged.

Also, we should update the license headers to the latest version available at
http://www.apache.org/legal/src-headers.html.

> Port Nutch (or better) MimeType detection system into Tika
> ----------------------------------------------------------
>
> Key: TIKA-6
> URL: https://issues.apache.org/jira/browse/TIKA-6
> Project: Tika
> Issue Type: New Feature
> Components: general
> Affects Versions: 0.1-incubator
> Environment: Improvement is indep. of environment
> Reporter: Chris A. Mattmann
> Assignee: Chris A. Mattmann
> Fix For: 0.1-incubator
>
> Attachments: TIKA-6.Mattmann.091907.patch.txt
>
>
> This patch will contribute a MimeType detection system for Tika, including a
> MimeType data structure and associated content-detection facilities. It
> will use Nutch's MimeType system as a baseline, but I'm open to
> suggestions. Jerome Charron mentioned that he had an implementation of a
> MimeType system based on FreeDesktop.org's system. We should look into this
> as well.