Oh yeah, here's the output of tesseract -v:

tesseract 5.1.0
 leptonica-1.79.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.3) : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found OpenMP 201511
 Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.8 liblz4/1.9.2 libzstd/1.4.4
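
Since the document itself can't be posted, sharing only its TIFF header fields might be a workable middle ground. The sketch below is not part of Tesseract or Leptonica; it is a small standalone script (the function name tiff_summary is made up for illustration) that reads the first IFD of a classic TIFF and reports size, bits per sample, compression, photometric interpretation, and whether a colormap is present. As far as I understand, selectDefaultPdfEncoding is the Leptonica routine that picks a PDF encoding from the image depth and colormap, so these fields seem like the relevant ones. Whether an unusual depth is actually the trigger here is only a guess.

```python
import struct
import sys

# Baseline TIFF tags worth reporting (tag id -> name).
TAGS = {
    256: "ImageWidth",
    257: "ImageLength",
    258: "BitsPerSample",
    259: "Compression",
    262: "PhotometricInterpretation",
    277: "SamplesPerPixel",
    320: "ColorMap",
}
# TIFF field type -> (struct code, size in bytes). Only SHORT and LONG are decoded.
TYPES = {3: ("H", 2), 4: ("I", 4)}


def tiff_summary(data):
    """Return selected tags from the first IFD of a classic (non-BigTIFF) file."""
    if data[:2] == b"II":
        bo = "<"  # little-endian
    elif data[:2] == b"MM":
        bo = ">"  # big-endian
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(bo + "HI", data[2:8])
    if magic != 42:
        raise ValueError("unsupported TIFF variant")

    (n_entries,) = struct.unpack_from(bo + "H", data, ifd_offset)
    out = {}
    for i in range(n_entries):
        entry = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, ftype, count = struct.unpack_from(bo + "HHI", data, entry)
        if tag not in TAGS:
            continue
        if tag == 320:
            out[TAGS[tag]] = True  # only note that a colormap exists
            continue
        if ftype not in TYPES:
            continue
        code, size = TYPES[ftype]
        if size * count <= 4:
            pos = entry + 8  # value is stored inline in the entry
        else:
            (pos,) = struct.unpack_from(bo + "I", data, entry + 8)
        values = struct.unpack_from(bo + code * count, data, pos)
        out[TAGS[tag]] = values[0] if count == 1 else list(values)
    return out


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        print(tiff_summary(f.read()))
```

Run it as "python3 tiff_summary.py ocrIn_1.tif" (the script file name is arbitrary). It only looks at the first page of a multi-page TIFF and prints no pixel data, so the output should be safe to paste into a reply.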

On Monday, June 6, 2022 at 11:46:30 AM UTC-5 Lucas L. wrote:

> It seems to be specific to the document in question. However, I'm afraid I 
> can't post the document because it contains sensitive information. I guess 
> I can try to scrub the info using an image editing tool and see if the 
> error still occurs.
>
> On Monday, June 6, 2022 at 11:21:25 AM UTC-5 zdenop wrote:
>
>> Can you please share ocrIn_1.tif and tell us which tessdata version you use,
>> plus the output of 'tesseract -v'?
>>
>> Zdenko
>>
>>
>> On Mon, 6 Jun 2022 at 17:53, Lucas L. <infinit...@gmail.com> wrote:
>>
>>> Hi, I'm trying to upgrade Tesseract from 4.1.1 to 5.1 on the Ubuntu 20.04 
>>> VMs we use to OCR documents. Both versions were built from source on that 
>>> VM. 4.1.1 worked, but 5.1 throws an error that I can't seem to find 
>>> anywhere else online:
>>>
>>> sudo -u userx tesseract --loglevel ALL --oem 1 -l eng 
>>> /opt/.../pdfprocessor/test/ocr-working/1/ocrIn_1.tif 
>>> /opt/.../pdfprocessor/test/test pdf
>>> Error in selectDefaultPdfEncoding: type selection failure
>>> Error during processing.
>>>
>>> I have tried the training data from both "tessdata" and "tessdata_best" 
>>> and got the same error. Any help would be appreciated.
>>>
>>> Thanks,
>>> Lucas LeBlanc
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to tesseract-oc...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/tesseract-ocr/6a8a3c7c-5c09-478e-a897-dca4314646e6n%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/tesseract-ocr/6a8a3c7c-5c09-478e-a897-dca4314646e6n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>

