Hello Shree,

Today I managed to run the language_metrics utility with Tesseract 4.0.0-beta1, and it worked. Therefore, as long as the command-line options of the Tesseract executables do not change, pytesstrain is compatible with both Tesseract 3 and Tesseract 4, and possibly with future releases of Tesseract.

Kind regards,
Wincent

On Sunday, 5 January 2020 at 03:50:49 UTC+1, shree wrote:
>
> Thanks for the info. It looks like a helpful set of tools.
>
> Please confirm whether this is for training legacy tesseract and which
> versions of tesseract are compatible with it.
>
> On Sun, Jan 5, 2020, 02:22 Wincent Balin <[email protected]> wrote:
>
>> Hi all,
>>
>> I would like to announce pytesstrain, a collection of Tesseract training
>> tools, as well as the underlying library. The tools were created while
>> training Tesseract to recognise the Akkadian language (stay tuned for more
>> posts!), to solve the problems that emerged in the process.
>>
>> You can install it with pip install pytesstrain.
>>
>> The PyPI page for the package is https://pypi.org/project/pytesstrain/.
>> The GitHub project page is https://github.com/wincentbalin/pytesstrain.
>>
>> This package contains tools to create dictionary data (wordlist, bi-
>> and unigram lists, etc.), rewrap lines in text files to the specified
>> length, collect the most frequent recognition errors and dump them into
>> a unicharambigs file, and compute recognition metrics (WER and CER). It
>> also contains the run_test() function, which creates an image file from
>> the given string and performs OCR on it afterwards, as well as its
>> parallelised version, run_tests(), which can be used in future tools.
>>
>> Feedback, suggestions, etc. would be most welcome.
>>
>> Yours truly,
>>
>> Wincent

-- 
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/ad9215c2-484b-4901-ac60-9014615e75f5%40googlegroups.com.
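
For readers unfamiliar with the WER and CER metrics mentioned in the announcement, here is a minimal sketch of how they are commonly defined: the Levenshtein edit distance between the OCR output and the reference text, divided by the reference length, counted in words for WER and in characters for CER. This is not taken from pytesstrain; the function names edit_distance, cer and wer are made up for illustration, and the package's own implementation may differ (for example in how it normalises whitespace).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

With a ground-truth line and the text Tesseract returned for it, cer(truth, ocr_text) and wer(truth, ocr_text) each give a ratio where 0.0 means a perfect match. Values above 1.0 are possible when the OCR output is much longer than the reference.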

