You need to install the 1.8-SNAPSHOT version of Tika for your assignment. Please read the assignment instructions again.
http://sunset.usc.edu/classes/cs572_2015/

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Nikunj Gala <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, February 18, 2015 at 2:24 PM
To: "[email protected]" <[email protected]>
Subject: Tesseract OCR and GDAL in Tika plugin for Nutch?

>The current source of Nutch uses Tika 1.7 as per repository in github.
>(https://github.com/apache/nutch/commit/3e2e688bd097727f457f1aa882c74a128f0a53da)
>
>As per Apache Tika 1.7 webpage, Tika 1.7 includes GDAL and Tesseract OCR
>(installation required).
>But the Nutch source does not have GDAL and Tesseract OCR in parse-tika
>plugin.
>
>How to include GDAL and Tesseract OCR sources in Tika plugin for Nutch?

