Hello Shree,

Today I managed to run the language_metrics utility with Tesseract
4.0.0-beta1, and it worked. Therefore, as long as the command-line options
of the Tesseract executables do not change, pytesstrain is compatible with
both Tesseract 3 and Tesseract 4, and possibly with future releases.
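
For readers unfamiliar with the WER and CER metrics mentioned in the
announcement quoted below: both are commonly derived from the Levenshtein
edit distance between the expected text and the OCR output. The following
is only a minimal sketch of that general technique, not pytesstrain's
actual code; the names edit_distance, cer and wer are hypothetical and
chosen purely for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Here reference would be the ground-truth string and hypothesis the text
returned by the OCR engine; lower values mean better recognition.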

Kind regards,

Wincent


On Sunday, 5 January 2020 at 03:50:49 UTC+1, shree wrote:
>
> Thanks for the info. It looks like a helpful set of tools.
>
> Please confirm whether this is for training legacy tesseract and which 
> versions of tesseract are compatible with it.
>
> On Sun, Jan 5, 2020, 02:22 Wincent Balin <[email protected]> wrote:
>
>> Hi all,
>>
>> I would like to announce pytesstrain, a collection of Tesseract training 
>> tools, as well as the underlying library. The tools were created while 
>> training Tesseract to recognise the Akkadian language (stay tuned for more
>> posts!), to solve the problems that emerged in the process.
>>
>> You can install it with pip install pytesstrain.
>>
>> The PyPI page for the package is https://pypi.org/project/pytesstrain/. 
>> The GitHub project page is https://github.com/wincentbalin/pytesstrain.
>>
>> This package contains tools to create dictionary data (wordlist, bi-
>> and unigram lists, etc.), rewrap lines in text files to a specified
>> length, collect the most frequent recognition errors and dump them into
>> a unicharambigs file, and compute recognition metrics (WER and CER). It
>> also contains the run_test() function, which creates an image file from
>> a given string and then performs OCR on it, as well as its
>> parallelised version, run_tests(), which can be used in future tools.
>>
>> Feedback, suggestions, etc. would be most welcome.
>>
>> Yours truly,
>>
>> Wincent
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/a8162fc0-edb2-4b7d-93b8-f2bb99612f0b%40googlegroups.com
>>
>

