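Tesseract cannot read PDF files directly, because PDF is a document format 
rather than an image format. You'll need to convert the PDF to an image 
format (such as the TIFF or JPEG exports you already have working) first.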
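
For anyone who wants to script the conversion, here is a rough sketch in 
Python. It assumes that pdftoppm (from Poppler, not part of Tesseract and 
not mentioned in the original message) is installed and on the PATH 
alongside tesseract. The function names, the "page" file prefix, the 300 dpi 
default and the "eng" language are illustrative choices of mine, not 
anything Tesseract requires.

```python
import subprocess
from pathlib import Path


def pdftoppm_command(pdf_path, prefix, dpi=300):
    # Assumption: Poppler's pdftoppm is installed. With -png it writes one
    # image per page, named <prefix>-1.png, <prefix>-2.png, ...
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(prefix)]


def tesseract_command(image_path, lang="eng"):
    # tesseract <image> <outputbase> -l <lang> writes <outputbase>.txt
    image_path = Path(image_path)
    output_base = image_path.with_suffix("")
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_pdf(pdf_path, work_dir, dpi=300, lang="eng"):
    """Rasterize each PDF page to PNG, OCR each page, return the joined text."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    prefix = work_dir / "page"  # hypothetical prefix, chosen for this sketch

    subprocess.run(pdftoppm_command(pdf_path, prefix, dpi), check=True)

    texts = []
    # pdftoppm zero-pads page numbers when needed, so a plain sort keeps
    # the pages in order.
    for image in sorted(work_dir.glob("page-*.png")):
        subprocess.run(tesseract_command(image, lang), check=True)
        texts.append(image.with_suffix(".txt").read_text(encoding="utf-8"))
    return "\n".join(texts)
```

Calling ocr_pdf("scan.pdf", "ocr_work") would then leave the page images and 
per-page .txt files in ocr_work and return the combined text. Any other 
PDF-to-image converter (including the Preview.app export described below) 
can stand in for pdftoppm.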

On Thursday, June 23, 2016 at 7:12:13 PM UTC-5, John Muccigrosso wrote:
>
> Recently installed tesseract and am having some trouble with PDFs. The 
> error is some form of:
>
> Error in fopenReadStream: file not found
> %���� in pixRead: image file not found: %PDF-1.3
> %���� cannot be read!
> Error during processing.
>
> where the 1.3 may be 1.4 or 1.6. Things are fine with a jpg or tiff 
> version of the same PDF (created by exporting from Preview.app).
>
> System: Mac OS X 10.9.5.
> "tesseract -v" reports:
>
> tesseract 3.04.01
>  leptonica-1.72
>   libjpeg 8d : libpng 1.6.23 : libtiff 4.0.6 : zlib 1.2.5
>
>
> I installed tesseract and leptonica with homebrew and "brew info 
> tesseract" reports:
>
> tesseract: stable 3.04.01 (bottled), HEAD
> OCR (Optical Character Recognition) engine
> https://github.com/tesseract-ocr/
> /usr/local/Cellar/tesseract/3.04.01_1 (93 files, 39.5M) *
>   Poured from bottle on 2016-05-27 at 15:41:15
> From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/tesseract.rb
> ==> Dependencies
> Required: leptonica ✔
> Recommended: libtiff ✔
> ==> Options
> --with-all-languages
>  Install recognition data for all languages
> --with-opencl
>  Enable OpenCL support
> --with-training-tools
>  Install OCR training tools
> --without-libtiff
>  Build without libtiff support
> --HEAD
>  Install HEAD version
>
>
> I suspect some missing package or something similar, but don't know what 
> exactly.
>
> TIA.
>
