Hi again,
I have now copied over the actual eng.traineddata file (14.6 MB), and it no longer has a problem opening it. The training process gets further but then fails because of the Python installation: it requires Pillow. My next problem is that when I use Python, I usually work in a virtual environment, partly for this very reason. I don't know how to edit the existing Python installation, i.e. how to add libraries to it. I use Ubuntu. I guess I will have to read up on that next.

Thanks so far, though; copying the file manually worked.

Kind regards,
Shavkat Sultanov

Shavkat Sultanov wrote on Wednesday, 7 January 2026 at 09:29:12 UTC+1:

> Hi again Zdenko,
>
> You are right about the file size, though. Why does wget get me a smaller
> file, and what do I do? If I download it on my Windows machine, I get the
> 14.6 MB file. Can I get it from GitHub to my Linux machine somehow, or do
> I have to download it here and transfer it over manually?
>
> Please help!
>
> Kind regards,
> Shavkat Sultanov
>
> zdenop wrote on Wednesday, 7 January 2026 at 08:36:24 UTC+1:
>
>> You mentioned:
>>
>>> I downloaded it from here:
>>> https://github.com/tesseract-ocr/tessdata_best/blob/main/eng.traineddata
>>
>> But the GitHub file is 14.7 MB; yours is 195584 bytes... What did you
>> download?
>>
>> Zdenko
>>
>> On Wed, 7 Jan 2026 at 8:26, Shavkat Sultanov <[email protected]> wrote:
>>
>>> [image: Screenshot 2026-01-07 082509.png]
>>> if this helps ...
>>>
>>> It may be an issue with my machine, then? I don't know Linux that well
>>> yet, I am sorry.

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/tesseract-ocr/3b59b56f-ddd9-461e-808d-f9172d619bffn%40googlegroups.com.
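
A note on the file-size mismatch discussed in the quoted messages: one plausible explanation, not confirmed anywhere in this thread, is that a GitHub URL containing /blob/ serves the web page that displays the file rather than the file itself, so wget would save that page. The Python sketch below rewrites such a URL to its /raw/ form and checks whether an already downloaded file looks like an HTML page. The names blob_to_raw_url and looks_like_html are hypothetical helpers made up for this illustration, and the HTML check is only a rough heuristic.

```python
def blob_to_raw_url(url: str) -> str:
    """Rewrite a GitHub '/blob/' page URL to the matching '/raw/' file URL.

    Hypothetical helper: only the first '/blob/' segment is replaced, and
    URLs without that segment are returned unchanged.
    """
    return url.replace("/blob/", "/raw/", 1)


def looks_like_html(path: str, probe_bytes: int = 512) -> bool:
    """Rough heuristic: True if the file starts like an HTML page.

    Only the first `probe_bytes` bytes are read, so this is cheap even for
    large files. It does not prove that a file is valid traineddata.
    """
    with open(path, "rb") as handle:
        head = handle.read(probe_bytes).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


if __name__ == "__main__":
    page_url = (
        "https://github.com/tesseract-ocr/tessdata_best/blob/main/eng.traineddata"
    )
    # Print the URL that should point at the file itself rather than the page.
    print(blob_to_raw_url(page_url))
```

If looks_like_html returns True for the small download, the saved file is probably the web page, and fetching the rewritten URL with wget may be worth a try before transferring the file manually.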

