OK I checked with Manning! We can contribute the source code! :) I will prepare it as part of the tika-examples package. Woot!
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: <Mattmann>, Chris Mattmann <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, August 7, 2014 2:54 PM
To: "[email protected]" <[email protected]>
Subject: Re: [DISCUSS] Give examples of Parser, Detector, and Translator usage

>Hey Nick! :)
>
>I'd have no problem pinching the code from Tika in Action. I wonder if
>the Manning folks would mind.
>
>I'll reach out to them.
>
>Cheers,
>CHris
>
>
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Chris Mattmann, Ph.D.
>Chief Architect
>Instrument Software and Science Data Systems Section (398)
>NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>Office: 168-519, Mailstop: 168-527
>Email: [email protected]
>WWW: http://sunset.usc.edu/~mattmann/
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Adjunct Associate Professor, Computer Science Department
>University of Southern California, Los Angeles, CA 90089 USA
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>-----Original Message-----
>From: Nick Burch <[email protected]>
>Reply-To: "[email protected]" <[email protected]>
>Date: Thursday, August 7, 2014 2:42 PM
>To: "[email protected]" <[email protected]>
>Subject: Re: [DISCUSS] Give examples of Parser, Detector, and Translator usage
>
>>On Thu, 7 Aug 2014, Tyler Palsulich wrote:
>>> Sounds like the new module is a good idea. So, let's jump on it! I will
>>> create a new 'example' JIRA tag and create issues for creating the
>>> module and adding Parse, Detect, and Translate examples. Others should
>>> add issues/desired examples as they see fit. How's that sound?
>>
>>I wonder if it's worth approaching those crazy fools who wrote a book on
>>Tika, to see if we could pinch one or two of their examples? If only we
>>knew who they were... ;-)
>>
>>
>>Recursion is one that causes confusion, we've got some example programs
>>on the wiki that we can include:
>>https://wiki.apache.org/tika/RecursiveMetadata
>>
>>Ray Gauss is probably our best bet for advanced metadata stuff to send in
>>some examples on that!
>>
>>Another one that has generated mailing list traffic lately is embedded
>>images, including re-writing links to them. There's some (LGPL) code in
>>Alfresco which I wrote a few years ago to do that, Ray might be able to
>>get the nod to contribute that (or a cut-down version) as an example of
>>that style of parsing html + embedded resources in parallel
>>
>>Nick
>
