Hi Tyler,

Is there any way to test whether the newest version of Tika is working in Nutch?
On Wednesday, February 18, 2015, Tyler Palsulich <[email protected]> wrote:

> If you have gdal and Tesseract installed locally, they will be run against
> (eligible) parsed files in Tika. There shouldn't be any required
> configuration on the Nutch side.
>
> Please see http://wiki.apache.org/tika/TikaOCR and
> http://wiki.apache.org/tika/TikaGDAL for how to install/run them.
>
> Hope that helps,
> Tyler
>
> On Wed, Feb 18, 2015 at 5:24 PM, Nikunj Gala <[email protected]> wrote:
>
>> The current source of Nutch uses Tika 1.7 as per repository in github.
>> (https://github.com/apache/nutch/commit/3e2e688bd097727f457f1aa882c74a128f0a53da)
>> As per Apache Tika 1.7 webpage, Tika 1.7 includes GDAL and Tesseract OCR
>> (installation required).
>> But the Nutch source does not have GDAL and Tesseract OCR in parse-tika
>> plugin.
>>
>> How to include GDAL and Tesseract OCR sources in Tika plugin for Nutch?
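
As a first sanity check before involving Nutch at all, one could confirm that the external tools are visible on the PATH of the user running the crawl. The sketch below is only an illustration: it assumes that Tika calls the `tesseract` and `gdalinfo` executables by those names, and the helper name `find_tools` is hypothetical. It does not prove that Nutch's parse-tika plugin actually uses them.

```python
import shutil

# Assumption: Tika's OCR and GDAL parsers shell out to these executables.
REQUIRED_TOOLS = ("tesseract", "gdalinfo")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If both tools resolve, the next step would be to parse a sample image or raster file and see whether extracted text or GDAL metadata shows up in the parse output.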

