I think I may be uniquely qualified to answer this from an "Idiot's guide", newish-to-Tika perspective. :) Apologies if I'm missing more obvious answers!
SVN info: http://tika.apache.org/source-repository.html

For how to contribute in general, Lucene has a good description: http://wiki.apache.org/lucene-java/HowToContribute

POI does too: http://poi.apache.org/guidelines.html

If you're adding binary files, I found POI's patch task to be very useful. Grab "patch.xml" from POI's svn and run:

    ant -f patch.xml

-----Original Message-----
From: Kai-Uwe Schmidt [mailto:[email protected]]
Sent: Thursday, July 11, 2013 10:45 AM
To: [email protected]
Subject: RE: MagicDetector don't work for all RFC882 message Types.

Sorry, I meant "patch" :-/

-----Original Message-----
From: Kai-Uwe Schmidt [mailto:[email protected]]
Sent: Thursday, July 11, 2013 4:42 PM
To: [email protected]
Subject: RE: MagicDetector don't work for all RFC882 message Types.

Where can I read how to provide a path?

-----Original Message-----
From: Nick Burch [mailto:[email protected]]
Sent: Thursday, July 11, 2013 12:48 PM
To: [email protected]
Subject: Re: MagicDetector don't work for all RFC882 message Types.

On Thu, 11 Jul 2013, Kai-Uwe Schmidt wrote:
> I am trying to use Tika to extract metadata from .eml files created via
> Novell GroupWise. Doing so, I ran into a problem with the detection of
> "message/rfc822". The MagicDetector (working with the default
> tika-mimetypes.xml) compares the "match" values byte for byte. RFC822
> specifies that header field names are case-independent (see
> http://www.ietf.org/rfc/rfc0822.txt 3.4.7), so MIME-Version is the
> same as Mime-Version.

Best bet is to open a bug in JIRA and upload a (small!) sample file that shows the problem. We'll need to tweak the mime rules to include that case combination too. (IIRC, the mime magic rules don't support case-insensitive matching.)

Nick
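
For anyone reading the thread later, here is a rough sketch of why a byte-for-byte magic comparison misses a header written with different capitalisation, while an ASCII case-insensitive comparison (what RFC 822 section 3.4.7 allows for header names) would accept it. This is plain Java and does not use Tika's MagicDetector at all; the class name, the method names and the "MIME-Version:" rule value are all made up for illustration and are not taken from tika-mimetypes.xml.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: not Tika code, and not how MagicDetector is implemented.
public class HeaderPrefixMatch {

    /** Byte-for-byte comparison: the prefix must match exactly, including case. */
    static boolean startsWithExact(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Same comparison, but ASCII letters are folded to lower case first. */
    static boolean startsWithIgnoreAsciiCase(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (toLowerAscii(data[i]) != toLowerAscii(prefix[i])) {
                return false;
            }
        }
        return true;
    }

    /** Folds 'A'..'Z' to 'a'..'z' and leaves every other byte untouched. */
    static byte toLowerAscii(byte b) {
        return (b >= 'A' && b <= 'Z') ? (byte) (b + 32) : b;
    }

    public static void main(String[] args) {
        // Assumed rule value and sample header, chosen only to show the mismatch.
        byte[] rule = "MIME-Version:".getBytes(StandardCharsets.US_ASCII);
        byte[] message = "Mime-Version: 1.0\r\n".getBytes(StandardCharsets.US_ASCII);

        System.out.println(startsWithExact(message, rule));           // false
        System.out.println(startsWithIgnoreAsciiCase(message, rule)); // true
    }
}
```

Since (as Nick recalls) the magic rules may not support case-insensitive matching, the practical fix discussed in the thread is still to add the extra case combination to the mime rules rather than to change the comparison.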
