Please show a minimum of respect and search Google for a solution first.
Zdenko

On Fri, Jun 7, 2024 at 18:23, Fred Andrews <[email protected]> wrote:

> I captured a screenshot of a VirtualBox guest boot crash and Tesseract
> didn't seem to do very well OCRing that text, so I wanted to try the older
> engine, which the help says should be possible by using "--oem 0".
> However, this doesn't work:
>
> D:\temp\virtualbox-project>"c:\Program Files\Tesseract-OCR\tesseract.exe" vb-crash.png output --oem 0
> Error: Tesseract (legacy) engine requested, but components are not present in c:\Program Files\Tesseract-OCR/tessdata/eng.traineddata!!
> Failed loading language 'eng'
> Tesseract couldn't load any languages!
> Could not initialize tesseract.
>
> But, I installed Tesseract 5.4.0 using the prebuilt binary:
>
> https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-5.4.0.20240606.exe
>
> and so that file IS present at the location claimed:
>
> c:\Program Files\Tesseract-OCR\tessdata>dir
>  Volume in drive C is DESKOS
>  Volume Serial Number is EA89-635E
>
>  Directory of c:\Program Files\Tesseract-OCR\tessdata
>
> 06/07/2024  10:59 AM    <DIR>          .
> 06/07/2024  10:59 AM    <DIR>          ..
> 06/07/2024  04:50 AM    <DIR>          configs
> 06/06/2024  09:18 AM         4,113,088 eng.traineddata
> 01/16/2019  03:53 PM                33 eng.user-patterns
> 01/16/2019  03:53 PM                27 eng.user-words
> 06/06/2024  09:19 AM           128,076 jaxb-api-2.3.1.jar
> 06/06/2024  09:18 AM        10,562,727 osd.traineddata
> 06/06/2024  09:41 AM               572 pdf.ttf
> 06/06/2024  09:19 AM           125,187 piccolo2d-core-3.0.1.jar
> 06/06/2024  09:19 AM           149,558 piccolo2d-extras-3.0.1.jar
> 06/07/2024  04:50 AM    <DIR>          script
> 06/06/2024  09:19 AM            26,376 ScrollView.jar
> 06/07/2024  04:50 AM    <DIR>          tessconfigs
>                9 File(s)     15,105,644 bytes
>                5 Dir(s)  1,600,415,711,232 bytes free
>
> So it looks like either paths aren't being handled properly on Windows
> (note the use of forward slashes in the output), or somehow the old engine
> expects a different format than the eng.traineddata installed with 5.4.0.
>
> Should I attempt to file an issue on the Mannheim GitHub site?
> https://github.com/UB-Mannheim/tesseract
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/d7361b3a-a338-4a27-b1f3-0914160b0ff3n%40googlegroups.com
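
For readers who land on this thread: the error quoted above is what Tesseract typically reports when the traineddata file contains only LSTM components. The models in the tesseract-ocr/tessdata repository include legacy engine data, whereas the tessdata_fast and tessdata_best models are LSTM-only. The following is a minimal sketch, not a confirmed fix for this particular install. It assumes a legacy-capable eng.traineddata has been downloaded into a separate directory (the path `D:\temp\tessdata-legacy` is hypothetical), and `build_legacy_command` is an illustrative helper name, not part of Tesseract. It uses only the `--tessdata-dir`, `-l`, and `--oem` command-line options.

```python
def build_legacy_command(tesseract_exe, image, output_base, tessdata_dir, lang="eng"):
    """Build an argument list that runs Tesseract with the legacy engine.

    tessdata_dir is assumed to contain a <lang>.traineddata file that
    includes legacy components (for example, one taken from the
    tesseract-ocr/tessdata repository).
    """
    return [
        str(tesseract_exe),
        str(image),
        str(output_base),
        "--tessdata-dir", str(tessdata_dir),  # look for models here instead of the default
        "-l", lang,
        "--oem", "0",                         # 0 selects the legacy engine only
    ]


if __name__ == "__main__":
    cmd = build_legacy_command(
        r"c:\Program Files\Tesseract-OCR\tesseract.exe",
        "vb-crash.png",
        "output",
        r"D:\temp\tessdata-legacy",  # hypothetical directory, see the note above
    )
    # Print the arguments for inspection; passing the list to
    # subprocess.run(cmd, check=True) would execute it.
    print(cmd)
```

Keeping the legacy-capable model in its own directory leaves the installer's files untouched, so the default LSTM setup continues to work as before.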

