Integrating Tika into Hadoop and Tika with Tesseract

chethan Thu, 24 Nov 2011 23:29:43 -0800

Hi,

As I am new to Tika, I would like to know the following:


1. How can Tika be integrated with Hadoop, so that the parsing runs as a
MapReduce job?
2. We also want Tika to extract text from scanned (OCR) files. Since Tika
does not support OCR parsing itself and recommends using Tesseract, I want
to know how to call Tesseract (a command-line tool) through Tika, which
would in turn run under MapReduce to parse those files.
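
To make the second question concrete, here is a minimal sketch of the kind of command-line call I have in mind, in plain Java with no Tika or Hadoop code involved. The class and method names (TesseractCli, buildCommand, ocr) are made up for illustration. It assumes the tesseract binary is on the PATH and is invoked as "tesseract <image> <outputbase>", which writes the recognized text to <outputbase>.txt. How something like this should be wired into Tika or into a map task is exactly what I am asking about.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tika or Hadoop.
class TesseractCli {

    // Builds the argument list for: tesseract <image> <outputBase>
    // Assumption: tesseract appends ".txt" to outputBase for its text output.
    static List<String> buildCommand(String imagePath, String outputBase) {
        return Arrays.asList("tesseract", imagePath, outputBase);
    }

    // Runs tesseract on one image and returns the recognized text.
    // Assumption: the "tesseract" binary is installed and on the PATH.
    static String ocr(Path image) throws IOException, InterruptedException {
        Path workDir = Files.createTempDirectory("ocr");
        Path outputBase = workDir.resolve("result");
        Path textFile = workDir.resolve("result.txt");
        try {
            Process process = new ProcessBuilder(
                    buildCommand(image.toString(), outputBase.toString()))
                    .inheritIO() // pass tesseract's own messages through to the console
                    .start();
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IOException("tesseract exited with code " + exitCode);
            }
            return new String(Files.readAllBytes(textFile), StandardCharsets.UTF_8);
        } finally {
            // Clean up the temporary output file and directory.
            Files.deleteIfExists(textFile);
            Files.deleteIfExists(workDir);
        }
    }
}
```

Calling TesseractCli.ocr(Paths.get("scan.png")) would then return the text of one image. In a MapReduce setting I imagine each map task making one such call per input file, but I do not know whether that is the right approach.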

Thanks and regards,
Chethan
