On Mon, 4 Jan 2010 05:09:33 -0800 (PST) Eitan <[email protected]> wrote:
> I am a newbie...
> Is there a standard way to extract text from PDF using tesseract-ocr?

Unless your PDF consists of images, this is not the way to go. PDF is a document format, not an image format. Use a tool like pdftotext.

James
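As a rough illustration of the advice above, here is a minimal Python sketch that calls pdftotext and checks whether the PDF had any text layer at all. It assumes pdftotext (from Xpdf or Poppler) is installed and on the PATH. The function names are made up for this example, and the "empty output means image-only" test is only a heuristic, not something pdftotext itself reports.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path):
    """Build the pdftotext command line (helper name is hypothetical)."""
    # "-" as the output file tells pdftotext to write the text to stdout;
    # -layout asks it to keep the original physical layout where it can.
    return ["pdftotext", "-layout", pdf_path, "-"]


def looks_image_only(text):
    """Heuristic (an assumption): no visible characters means no text layer."""
    # pdftotext separates pages with form feeds, which strip() also removes.
    return not text.strip()


def extract_text(pdf_path):
    """Run pdftotext on pdf_path and return the extracted text."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on PATH")
    result = subprocess.run(
        pdftotext_command(pdf_path),
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
    return result.stdout
```

If extract_text() returns something for which looks_image_only() is true, the PDF is probably made of scanned pages. Only in that case does it make sense to render the pages to images and run them through tesseract.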

