FYI coming back to this thread, the Manning folks said it’s fine to contribute the code! :)
I’ve filed this issue to track it:

https://issues.apache.org/jira/browse/TIKA-1562

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: <Mattmann>, Chris Mattmann <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, August 13, 2014 at 3:09 PM
To: "[email protected]" <[email protected]>
Subject: Re: [DISCUSS] Give examples of Parser, Detector, and Translator usage

>sure np, my point is (and you'll see in Tika in Action examples) depending
>on the package namespace "Example" in the classname may be redundant.
>
>For example, is org.apache.tika.example.translator.ExampleTranslator
>redundant? :)
>
>-----Original Message-----
>From: Tyler Palsulich <[email protected]>
>Reply-To: "[email protected]" <[email protected]>
>Date: Wednesday, August 13, 2014 3:45 PM
>To: "[email protected]" <[email protected]>
>Subject: Re: [DISCUSS] Give examples of Parser, Detector, and Translator usage
>
>>I think the *Example names are useful, since then the names can overlap
>>with the class they give an example of. For example, TranslatorExample
>>should show how to use a Translator.
>>
>>Tyler
>>
>>On Tue, Aug 12, 2014 at 4:37 PM, Mattmann, Chris A (3980) <[email protected]> wrote:
>>
>>> let's go with o.a.tika.example
>>> Class names don't need Example in them.
>>>
>>> Sound good?
>>>
>>> -----Original Message-----
>>> From: Tyler Palsulich <[email protected]>
>>> Reply-To: "[email protected]" <[email protected]>
>>> Date: Tuesday, August 12, 2014 4:34 PM
>>> To: "[email protected]" <[email protected]>
>>> Subject: Re: [DISCUSS] Give examples of Parser, Detector, and Translator usage
>>>
>>> >Woot! Any input on naming conventions for the examples?
>>> >Package: org.apache.tika.example.
>>> >File/class: *Example.java.
>>> >
>>> >Methods?
>>> >
>>> >Tyler
>>> >On Aug 12, 2014 9:32 AM, "Mattmann, Chris A (3980)" <[email protected]> wrote:
>>> >
>>> >> OK I checked with Manning! We can contribute the source code! :)
>>> >>
>>> >> I will prepare it as part of the tika-examples package. Woot!
>>> >>
>>> >> -----Original Message-----
>>> >> From: <Mattmann>, Chris Mattmann <[email protected]>
>>> >> Reply-To: "[email protected]" <[email protected]>
>>> >> Date: Thursday, August 7, 2014 2:54 PM
>>> >> To: "[email protected]" <[email protected]>
>>> >> Subject: Re: [DISCUSS] Give examples of Parser, Detector, and Translator usage
>>> >>
>>> >> >Hey Nick! :)
>>> >> >
>>> >> >I'd have no problem pinching the code from Tika in Action. I wonder if
>>> >> >the Manning folks would mind.
>>> >> >
>>> >> >I'll reach out to them.
>>> >> >
>>> >> >Cheers,
>>> >> >Chris
>>> >> >
>>> >> >-----Original Message-----
>>> >> >From: Nick Burch <[email protected]>
>>> >> >Reply-To: "[email protected]" <[email protected]>
>>> >> >Date: Thursday, August 7, 2014 2:42 PM
>>> >> >To: "[email protected]" <[email protected]>
>>> >> >Subject: Re: [DISCUSS] Give examples of Parser, Detector, and Translator usage
>>> >> >
>>> >> >>On Thu, 7 Aug 2014, Tyler Palsulich wrote:
>>> >> >>> Sounds like the new module is a good idea. So, let's jump on it! I will
>>> >> >>> create a new 'example' JIRA tag and create issues for creating the
>>> >> >>> module and adding Parse, Detect, and Translate examples. Others should
>>> >> >>> add issues/desired examples as they see fit. How's that sound?
>>> >> >>
>>> >> >>I wonder if it's worth approaching those crazy fools who wrote a book on
>>> >> >>Tika, to see if we could pinch one or two of their examples? If only we
>>> >> >>knew who they were... ;-)
>>> >> >>
>>> >> >>Recursion is one that causes confusion, we've got some example programs on
>>> >> >>the wiki that we can include:
>>> >> >>https://wiki.apache.org/tika/RecursiveMetadata
>>> >> >>
>>> >> >>Ray Gauss is probably our best bet for advanced metadata stuff to send in
>>> >> >>some examples on that!
>>> >> >>
>>> >> >>Another one that has generated mailing list traffic lately is embedded
>>> >> >>images, including re-writing links to them. There's some (LGPL) code in
>>> >> >>Alfresco which I wrote a few years ago to do that, Ray might be able to
>>> >> >>get the nod to contribute that (or a cut-down version) as an example of
>>> >> >>that style of parsing html + embedded resources in parallel
>>> >> >>
>>> >> >>Nick
