All,

FYI, kind words from a Tika supporter!

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

------ Forwarded Message
From: Craig Stires <craig.sti...@gmail.com>
Date: Wed, 29 Jul 2009 21:39:14 -0700
To: <ridabenjell...@apache.org>, <mattm...@apache.org>
Subject: a new project using tika has begun

Hi Rida and Chris,

Just want to send in a note of much appreciation for the work you've done (and the other Tika contributors, POI, PDF, Lucene, the list goes on).

Work is underway on a project which feeds off the Tika parser, as one of the content providers. Although Tika is still in a pre-1.0 stage, it is providing enough content to allow us to avoid delays and keep momentum. Thanks for that!

What I am hoping to contribute as we continue are examples of files that aren't parsing quite correctly, or have the wrong encoding set, etc. This project is running against English and Thai data, and will be moving into Japanese and Chinese sometime next year. So, maybe we will have access to a wider range of Asian language files than you might have currently.

I wish that we had the technical level to contribute patches, but if there's anything that can be passed along to you to help with test / dev, I'd be happy to do so.

Thanks again, and letting you know that your efforts are being put to good use.

-Craig

------ End of Forwarded Message