Hi,

I have a question about the Tesseract OCR parser that is part of Tika:
Is it possible to set Tesseract's output format to PDF? I believe Tesseract supports this option for converting an image file (e.g. tif) into a searchable PDF file:

$ tesseract --tessdata-dir ./ ./testing/eurotext.png ./testing/eurotext-eng -l eng pdf

I use the Tika REST API, and I wonder how I can tell the Tika Server to create a PDF output file.


Thanks for any help


Ralph

