I've got Tika working with Tesseract on PDF files, but when I give it a PDF
that has both searchable text and images, the text seems to be picked up
twice (presumably once by the normal text extraction and once by OCR).  Is
there a way to avoid this?  Two passes would be fine if necessary: one for
the straight text and then another for just the images.
