Nick,
I'm sorry that I missed your response (it wound up in my spam folder). I'd be
happy to draft a section on how to contribute for Tika's website. How do I
contribute that? Should I open an issue and submit HTML? Should I create a
separate HTML page or modify the http://tika.apache.org/source-repository.html page?
Thank you.
Best,
Tim
-----Original Message-----
From: Allison, Timothy B. [mailto:[email protected]]
Sent: Thursday, July 11, 2013 10:53 AM
To: [email protected]
Subject: RE: MagicDetector don't work for all RFC882 message Types.
I think I may be uniquely qualified to answer this from an Idiot's guide/newish
to Tika perspective. :) Apologies if I'm missing out on more obvious answers!
SVN info:
http://tika.apache.org/source-repository.html
Generally how to contribute (Lucene has a good description):
http://wiki.apache.org/lucene-java/HowToContribute
POI does too:
http://poi.apache.org/guidelines.html
If you're adding binary files, I found POI's patch task to be very useful.
Grab "patch.xml" from POI's svn and run:
ant -f patch.xml
-----Original Message-----
From: Kai-Uwe Schmidt [mailto:[email protected]]
Sent: Thursday, July 11, 2013 10:45 AM
To: [email protected]
Subject: RE: MagicDetector don't work for all RFC882 message Types.
Sorry patch was meant :-/
-----Original Message-----
From: Kai-Uwe Schmidt [mailto:[email protected]]
Sent: Thursday, July 11, 2013 4:42 PM
To: [email protected]
Subject: RE: MagicDetector don't work for all RFC882 message Types.
Where can I read how to provide a path?
-----Original Message-----
From: Nick Burch [mailto:[email protected]]
Sent: Thursday, July 11, 2013 12:48 PM
To: [email protected]
Subject: Re: MagicDetector don't work for all RFC882 message Types.
On Thu, 11 Jul 2013, Kai-Uwe Schmidt wrote:
> I am trying to use Tika to extract metadata from .eml files created via
> Novell GroupWise. In doing so, I ran into a problem with the detection of
> "message/rfc822". The MagicDetector (working with the default
> tika-mimetypes.xml) compares the "match" values byte for byte. RFC 822
> states that header field names are case-independent (see
> http://www.ietf.org/rfc/rfc0822.txt, section 3.4.7), so MIME-Version is
> the same as Mime-Version.
Best bet is to open a bug in JIRA and upload a (small!) sample file that shows
the problem. We'll need to tweak the mime rules to include that case
combination too. (IIRC, the mime magic rules don't support case-insensitive
matching.)
Nick
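
The difference discussed in this thread, between a byte-for-byte comparison
and the case-independent comparison that RFC 822 describes for header names,
can be sketched in a few lines of Python. This is only an illustration and not
Tika's MagicDetector code: the function names and the sample bytes below are
hypothetical, made up for the example.

```python
def matches_exact(data: bytes, pattern: bytes, offset: int = 0) -> bool:
    """Byte-for-byte comparison of the pattern against data at the offset."""
    return data[offset:offset + len(pattern)] == pattern


def matches_ignore_case(data: bytes, pattern: bytes, offset: int = 0) -> bool:
    """Same comparison, but with ASCII letters folded to lower case first."""
    window = data[offset:offset + len(pattern)]
    # bytes.lower() only changes ASCII letters, which is all that header
    # names may contain.
    return window.lower() == pattern.lower()


# Hypothetical start of a message whose header uses "Mime-Version"
# rather than "MIME-Version".
sample = b"Mime-Version: 1.0\r\nSubject: test\r\n"

print(matches_exact(sample, b"MIME-Version:"))        # False
print(matches_ignore_case(sample, b"MIME-Version:"))  # True
```

With an exact comparison, every case variant has to be listed as its own
rule, which is the workaround suggested above; a case-folding comparison
would accept all of them with a single pattern.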