Thanks for the info. It looks like a helpful set of tools. Could you confirm whether it is meant for training the legacy Tesseract engine, and which versions of Tesseract it is compatible with?
On Sun, Jan 5, 2020, 02:22 Wincent Balin <[email protected]> wrote:

> Hi all,
>
> I would like to announce pytesstrain, a collection of Tesseract training
> tools, as well as the underlying library. The tools were created while
> training Tesseract to recognise Akkadian language (stay tuned for more
> posts!), to solve the problems that emerged in the process.
>
> You can install it with pip install pytesstrain.
>
> The PyPI page for the package is https://pypi.org/project/pytesstrain/.
> The GitHub project page is https://github.com/wincentbalin/pytesstrain.
>
> This package contains the tools to create dictionary data (wordlist, bi-
> and unigram lists, etc.), rewrap lines in text files to the specified
> length, collect most frequent recognition errors and dump them into
> unicharambigs file, and to perform recognition metrics (WER and CER). It
> also contains the run_test() function, which creates an image file from
> the given string and performs OCR on it afterwards, as well as its
> parallelised version, run_tests(), which can be used in future tools.
>
> Feedback, suggestions, etc would be most welcome.
>
> Yours truly,
>
> Wincent

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduUAs3WDOyvcC5dMC0MGEMw060OPymisS6m7%2B9A5zm4nzg%40mail.gmail.com.

