I guess I am the author... ARYuanB5-MD is the font.
For further background, the stock tessdata_best/chi_tra.traineddata did not do
a good job at all on the text I'm trying to recognize.
So I retrained:
- copied the existing Chinese wordlist and added additional characters and
sentences (tota
Seems like you should put this question to the author of the language data
"ARYuanB5-MD"...
Zdenko
On Sun, 15 Oct 2023 at 15:44, 'Danny Wilson' via tesseract-ocr <
tesseract-ocr@googlegroups.com> wrote:
> Running tesseract on a single Chinese character "對" outputs the character,
> but also the text "
Honestly, this looks like a very messy configuration to me. Why? Tesseract (and
other projects) use CMake to avoid such manual settings.
Just follow the example in our GitHub action for CMake on Windows [1] - it is
stupidly simple and it works. CMake takes care of correct linking
(debug/release), and build (no
Running tesseract on a single Chinese character "對" outputs the character,
but also the text "xlz".
Command line:
tesseract sub0089w.png debugOut -l ARYuanB5-MD --dpi 72 --psm 6 -c preserve_interword_spaces=1
The output is two lines:
xlz
對
It used to output "sMz" but after retraining sever
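
For anyone who wants to reproduce the run from a script, here is a minimal sketch that wraps the same command line in Python. The helper names build_command and run_ocr are hypothetical and only for illustration. It assumes the tesseract binary is on the PATH and that the custom ARYuanB5-MD traineddata is installed. It only reruns the command quoted above and does not attempt to fix the stray "xlz" line.

```python
import subprocess


def build_command(image, out_base, lang="ARYuanB5-MD", dpi=72, psm=6):
    """Return the argv list for the tesseract call quoted in the post.

    build_command is a hypothetical helper, not part of tesseract.
    """
    return [
        "tesseract", image, out_base,
        "-l", lang,
        "--dpi", str(dpi),
        "--psm", str(psm),
        "-c", "preserve_interword_spaces=1",
    ]


def run_ocr(image, out_base):
    """Run tesseract and return the recognized text as a list of lines.

    Assumes tesseract is installed; it writes its text output to
    <out_base>.txt, which is read back here.
    """
    subprocess.run(build_command(image, out_base), check=True)
    with open(out_base + ".txt", encoding="utf-8") as fh:
        return fh.read().splitlines()
```

Calling run_ocr("sub0089w.png", "debugOut") should return the lines that tesseract wrote to debugOut.txt, which makes it easy to compare the output before and after retraining.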