Hi, I have a question about the Tesseract OCR parser that is part of Tika: is it possible to set Tesseract's output format to PDF? As far as I know, Tesseract supports converting an image file (e.g. a TIFF) into a searchable PDF file:
$ tesseract --tessdata-dir ./ ./testing/eurotext.png ./testing/eurotext-eng -l eng pdf
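For anyone scripting this, the command above could be wrapped roughly as follows. This is only a sketch: it assumes the tesseract executable is on the PATH, and the function names build_tesseract_pdf_command and image_to_searchable_pdf are made up for illustration. It calls Tesseract directly and does not go through Tika.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_pdf_command(image, output_base, lang="eng", tessdata_dir=None):
    """Build the argv list for the tesseract call shown above (hypothetical helper)."""
    cmd = ["tesseract"]
    if tessdata_dir is not None:
        cmd += ["--tessdata-dir", str(tessdata_dir)]
    # Trailing "pdf" selects tesseract's PDF output config.
    cmd += [str(image), str(output_base), "-l", lang, "pdf"]
    return cmd


def image_to_searchable_pdf(image, output_base, lang="eng", tessdata_dir=None):
    """Run tesseract and return the path of the PDF it is expected to write."""
    # Assumption: tesseract is installed and reachable via PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    cmd = build_tesseract_pdf_command(image, output_base, lang, tessdata_dir)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    # tesseract appends the extension to the output base name itself.
    return Path(f"{output_base}.pdf")
```

Calling image_to_searchable_pdf("./testing/eurotext.png", "./testing/eurotext-eng", tessdata_dir="./") should then reproduce the command line above.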
I use the Tika REST API, and I wonder how I can tell the Tika Server to create a PDF output file.
Thanks for any help,
Ralph
