Parser checker
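
For reference, a minimal sketch of running that check from a script. It assumes "Parser checker" means the `parsechecker` command of the `bin/nutch` launcher in Nutch 1.x and that `-dumpText` is available in your version. The install path, URL, and helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def parsechecker_command(nutch_home, url, dump_text=True):
    """Build the argument list for Nutch's parsechecker tool.

    Assumption: the Nutch runtime lives under `nutch_home` and exposes
    the usual `bin/nutch` launcher script.
    """
    cmd = [str(Path(nutch_home) / "bin" / "nutch"), "parsechecker"]
    if dump_text:
        # Assumed flag: asks parsechecker to print the extracted text too.
        cmd.append("-dumpText")
    cmd.append(url)
    return cmd


def run_parsechecker(nutch_home, url):
    """Run parsechecker and return whatever it wrote to stdout."""
    result = subprocess.run(
        parsechecker_command(nutch_home, url),
        capture_output=True,
        text=True,
        check=False,  # inspect the output even if the parse fails
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical install path and URL; replace with your own.
    print(run_parsechecker("/opt/nutch/runtime/local", "http://example.com/scan.png"))
```

Pointing it at an image or a geospatial file and looking for extracted text or metadata in the output is one way to see whether the Tika parsers are being invoked.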

On Feb 18, 2015, at 3:03 PM, Jiaxin Ye <[email protected]> wrote:

Hi Tyler,

Is there any way to test whether the newest version of Tika is working in Nutch?


On Wednesday, February 18, 2015, Tyler Palsulich <[email protected]> wrote:
If you have GDAL and Tesseract installed locally, Tika will run them against 
(eligible) files when it parses them. There shouldn't be any required 
configuration on the Nutch side.

Please see http://wiki.apache.org/tika/TikaOCR and 
http://wiki.apache.org/tika/TikaGDAL for how to install/run them.

Hope that helps,
Tyler
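
A minimal sketch for checking that the external tools are visible on the PATH of the machine that runs Nutch. It assumes the executables are named `tesseract` and `gdalinfo`; the helper names are hypothetical, and the wiki pages above remain the reference for installation.

```python
import shutil


def find_external_tools(names=("tesseract", "gdalinfo")):
    """Map each executable name to its full path, or None if not on PATH.

    Assumption: Tika shells out to programs with these names.
    """
    return {name: shutil.which(name) for name in names}


def missing_tools(found):
    """Return the names whose lookup came back empty."""
    return [name for name, path in found.items() if path is None]


if __name__ == "__main__":
    found = find_external_tools()
    for name, path in found.items():
        print(f"{name}: {path or 'not found on PATH'}")
    if missing_tools(found):
        print("Install the missing tools before expecting OCR/GDAL output.")
```

Run it as the same user and in the same environment as the Nutch job, since PATH can differ between shells and services.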

On Wed, Feb 18, 2015 at 5:24 PM, Nikunj Gala <[email protected]> wrote:
The current Nutch source uses Tika 1.7, according to the GitHub repository 
(https://github.com/apache/nutch/commit/3e2e688bd097727f457f1aa882c74a128f0a53da).
According to the Apache Tika 1.7 web page, Tika 1.7 includes GDAL and Tesseract 
OCR support (installation required).
However, the Nutch source does not include GDAL or Tesseract OCR in the 
parse-tika plugin.

How can I include the GDAL and Tesseract OCR sources in the Tika plugin for Nutch?
