I'll try to combine my patch and Tyler's for TIKA-1422 and see if that fixes it :) Will test today.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Tyler Palsulich <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, October 7, 2014 at 1:49 AM
To: "[email protected]" <[email protected]>
Subject: Re: Tesseract OCR always activeated parser for images

>Confirmed. This is why we ran into TIKA-1422. But Chris' patch may provide
>the backwards compatibility you're looking for. What do you think?
>
>Tyler
>
>On Mon, Oct 6, 2014 at 7:47 PM, Lewis John Mcgibbney <[email protected]> wrote:
>
>> Hi Folks,
>> Now, once I install Tesseract, it is run for every image I pass through
>> Tika server or Tika app.
>> This is not okay, as it does not give me the type of MD I am looking for.
>> This is just a note to folks to say that, AFAIK, you would need to
>> unregister the parser from [0] and then rebuild from source in order to
>> maintain backwards compatibility in this regard.
>> Before I log a ticket for this, can anyone else confirm this please?
>> Thanks
>> Lewis
>>
>> [0]
>> https://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser
>>
>> --
>> *Lewis*
>>
