> I’m not quite sure if I understand the question, but if all you want to do is
> pull the text out of an OCR’ed PDF file, then I have found both Tika and
> PDFtotext to be useful tools....
>
> On the other hand, if you need to do the OCR itself, then employing Tesseract
> is probably the way to go.
To clarify, I have to do the OCR itself. I've been using CAM::PDF to extract existing text.

Kyle