Hey Lewis,

On Apr 17, 2012, at 3:35 AM, Lewis John Mcgibbney wrote:

> 3) We previously discussed implementing the Any23 parser plugin as a tika
> wrapper, therefore it would look very similar to parse-tika?

I think it would be super awesome to add the Any23 parsing functionality as a Tika parser, and potentially an extension to the MIME repository to detect microformats, etc. Then in Nutch, we could take advantage of the any23 parser with the existing tika-parser interface.

Thoughts?

Thanks!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++