Hi Bertrand,

> About the freedesktop.org stuff, if it's indeed GPL (or even LGPL) we cannot
> redistribute it, see http://people.apache.org/~cliffs/3party.html (not
> official policy but AFAIK all the info in there is valid).

Ah, OK, I hear ya on that. Okey dok, so here's my suggestion then. I propose
that we remove the file freedesktop.org.xml, and then rename the DTD file from
freedesktop.org.dtd to mime.types.dtd. That way, we remove the
freedesktop.org-specific stuff, and we're simply using the data model for how
to structure the mime database. We can keep the file tika-mimetypes.xml b/c
it's derived from the Nutch mime database (Apache-licensed). This is
consistent with the approach taken in Nutch w.r.t. the plugin system XML
format, whose DTD is derived from Eclipse's plugin model. What do you think?

> Regarding question #2, this is something that might be better discussed on the
> list, I should have brought that up there. Nothing urgent anyway, I'm fine
> with having just one module for now.

As an FYI, this was discussed around a year or so ago, during the inception of
the Tika project. Glad that you're OK with the direction though, in any
case! :)

One last thing about having the conversation on the "list". Are you referring
to the dev list? If so, then I think it's OK to have conversations via JIRA
issues IMO, because all JIRA communication gets copied to tika-dev, so others
are made aware of the conversation as well...

Thanks!

Cheers,
  Chris

>
>> Port Nutch (or better) MimeType detection system into Tika
>> ----------------------------------------------------------
>>
>>                 Key: TIKA-6
>>                 URL: https://issues.apache.org/jira/browse/TIKA-6
>>             Project: Tika
>>          Issue Type: New Feature
>>          Components: general
>>    Affects Versions: 0.1-incubator
>>         Environment: Improvement is indep. of environment
>>            Reporter: Chris A. Mattmann
>>            Assignee: Chris A. Mattmann
>>             Fix For: 0.1-incubator
>>
>>         Attachments: TIKA-6.Mattmann.091907.patch.txt
>>
>>
>> This patch will contribute a MimeType detection system for Tika, including
>> MimeType data structure, and associated content-detection facilities. This
>> will be based on Nutch's MimeType system as a baseline, however, I'm open to
>> suggestions. Jerome Charron mentioned that he had an implementation of a
>> MimeType system based on FreeDesktop.org's system. We should look into this
>> as well.

______________________________________________
Chris Mattmann, Ph.D.
[EMAIL PROTECTED]
Cognizant Development Engineer
Early Detection Research Network Project
_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                     Mailstop: 171-246
_______________________________________________________

Disclaimer: The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.
