Please show minimal respect and first google for a solution.

Zdenko


On Fri, 7 Jun 2024 at 18:23, Fred Andrews <[email protected]> wrote:

> I captured a screenshot of a VirtualBox guest boot crash and Tesseract
> didn't seem to do very well OCRing that text, so I wanted to try the older
> engine, which the help says should be possible by using "--oem 0".
> However, this doesn't work:
>
> D:\temp\virtualbox-project>"c:\Program Files\Tesseract-OCR\tesseract.exe"
> vb-crash.png output --oem 0
> Error: Tesseract (legacy) engine requested, but components are not present
> in c:\Program Files\Tesseract-OCR/tessdata/eng.traineddata!!
> Failed loading language 'eng'
> Tesseract couldn't load any languages!
> Could not initialize tesseract.
>
> But, I installed Tesseract 5.4.0 using the prebuilt binary:
>
> https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-5.4.0.20240606.exe
> and so that file IS present at the location claimed:
>
> c:\Program Files\Tesseract-OCR\tessdata>dir
>  Volume in drive C is DESKOS
>  Volume Serial Number is EA89-635E
>
>  Directory of c:\Program Files\Tesseract-OCR\tessdata
>
> 06/07/2024  10:59 AM    <DIR>          .
> 06/07/2024  10:59 AM    <DIR>          ..
> 06/07/2024  04:50 AM    <DIR>          configs
> 06/06/2024  09:18 AM         4,113,088 eng.traineddata
> 01/16/2019  03:53 PM                33 eng.user-patterns
> 01/16/2019  03:53 PM                27 eng.user-words
> 06/06/2024  09:19 AM           128,076 jaxb-api-2.3.1.jar
> 06/06/2024  09:18 AM        10,562,727 osd.traineddata
> 06/06/2024  09:41 AM               572 pdf.ttf
> 06/06/2024  09:19 AM           125,187 piccolo2d-core-3.0.1.jar
> 06/06/2024  09:19 AM           149,558 piccolo2d-extras-3.0.1.jar
> 06/07/2024  04:50 AM    <DIR>          script
> 06/06/2024  09:19 AM            26,376 ScrollView.jar
> 06/07/2024  04:50 AM    <DIR>          tessconfigs
>                9 File(s)     15,105,644 bytes
>                5 Dir(s)  1,600,415,711,232 bytes free
>
> So it looks like either paths aren't being handled properly on Windows
> (note the use of forward slashes in the output), or somehow the old engine
> expects a different format than the eng.traineddata installed with 5.4.0.
>
> Should I attempt to file an issue on the Mannheim Github site?
> https://github.com/UB-Mannheim/tesseract
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/d7361b3a-a338-4a27-b1f3-0914160b0ff3n%40googlegroups.com
>
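
For anyone who lands on this thread: a likely explanation, assuming the eng.traineddata bundled with the installer is an LSTM-only model (its size of about 4 MB suggests so), is that the file exists but contains no legacy-engine components, so "--oem 0" cannot load it. The path handling is probably not the problem. The models in the tesseract-ocr/tessdata repository include the legacy components. The sketch below only builds the command line for a legacy run against a separate model folder. The folder D:\temp\tessdata-legacy and the helper name build_legacy_command are hypothetical; the flags --tessdata-dir, -l and --oem are standard tesseract options.

```python
import shutil
import subprocess
from pathlib import Path


def build_legacy_command(image, output_base, tessdata_dir, lang="eng", exe="tesseract"):
    """Return the argument list for a legacy-engine (--oem 0) run.

    tessdata_dir is assumed to hold a <lang>.traineddata that still
    contains the legacy components (for example one downloaded from the
    tesseract-ocr/tessdata repository).
    """
    return [
        exe,
        str(image),
        str(output_base),
        "--tessdata-dir", str(tessdata_dir),
        "-l", lang,
        "--oem", "0",  # 0 = legacy engine only
    ]


if __name__ == "__main__":
    # Hypothetical folder; adjust to wherever the legacy-capable model was saved.
    legacy_dir = Path(r"D:\temp\tessdata-legacy")
    cmd = build_legacy_command("vb-crash.png", "output", legacy_dir)
    print(" ".join(cmd))

    # Only run OCR if tesseract is on PATH and the model file is really there.
    if shutil.which(cmd[0]) and (legacy_dir / "eng.traineddata").is_file():
        subprocess.run(cmd, check=True)
```

To check what a given model file contains, "combine_tessdata -d eng.traineddata" lists its components. Keeping the legacy-capable file in its own folder leaves the installed tessdata directory untouched.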

