My boss has asked me to find out whether Nutch can index text contained in an
image inside a PDF document, using OCR to recognize it as text. In other
words, if someone uploads a PDF to our site, and that PDF is really an image
saved as a PDF, will Nutch extract the text within the image and index it as
part of that PDF document?


*Does Nutch index the text inside an image-based PDF?*
