Is there a solution to this, or am I going to have to dig into the
sources? Thanks!

[The actual TIF is nothing you'd ever want to OCR, but the error below
impedes batch conversion of the document.]

$ file in.tif
in.tif: TIFF image data, little-endian, direntries=16, height=2558, bps=1, 
compression=none, PhotometricIntepretation=BlackIsZero, 
orientation=upper-left, width=1667

$ tesseract in.tif out -l eng pdf
Tesseract Open Source OCR Engine v3.04.01 with Leptonica
Page 1
Too few characters. Skipping this page
OSD: Weak margin (0.00) for 4 blob text block, but using orientation 
anyway: 0
Error in fopenWriteStream: stream not opened
Error in pixWrite: stream not opened
Error in fopenReadStream: file not found
Error in extractG4DataFromFile: stream not opened to file
Error in l_generateG4Data: datacomp not extracted
Error in pixGenerateCIData: g4 data not made
Error in l_generateCIDataForPdf: file in.tif format is 4; unreadable
Error during processing.

$ tesseract -v
tesseract 3.04.01
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.0) : libpng 1.6.25 : libtiff 
4.0.6 : zlib 1.2.8 : libwebp 0.5.0 : libopenjp2 2.1.0
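
As a stopgap for the batch run, something like the following sketch could
record the pages that fail and carry on with the rest. It assumes tesseract
exits with a nonzero status after printing "Error during processing."
(not verified here). The names convert_batch and run are hypothetical, and
the command line is the same one shown above.

```python
import subprocess
from pathlib import Path


def convert_batch(tif_paths, run=subprocess.run):
    """Run tesseract on each TIFF and return the inputs that failed.

    `run` is injectable (hypothetical parameter) so the loop can be
    exercised without tesseract installed.
    """
    failed = []
    for tif in map(Path, tif_paths):
        # tesseract appends ".pdf" to the output base name itself
        out_base = tif.with_suffix("")
        cmd = ["tesseract", str(tif), str(out_base), "-l", "eng", "pdf"]
        result = run(cmd, capture_output=True, text=True)
        # Assumption: a failed page yields a nonzero exit status
        if result.returncode != 0:
            failed.append(str(tif))
    return failed
```

Calling convert_batch(sorted(str(p) for p in Path(".").glob("*.tif")))
would return the list of files to look at by hand. This only works around
the failure; it does not explain the pixWrite error itself.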
