It took some digging, but this issue has been resolved. I am reporting back
to this list because a few people have expressed interest.
At Larry Stone's suggestion, I verified that pdftotext (part of xpdf) was
able to extract text from my scanned PDF. I also re-OCRed the PDFs using
Acrobat 8 Pro,
I just created a collection of 72 PDFs, mostly from scanned image files, but
with several born-digital files too. I was disappointed to learn that
PDFBox was unable to process the scanned documents even though they contain
searchable text. The files were created using a third-party OCR tool, but