Hi, I keep having problems with duplicated letters with custom fine-tuned models.
For example, an M becomes MH. I'm using ocrd-train with actual crops, and I noticed that the lstmf files are generated with psm 6, while at runtime I use psm 7. Do you think this may make a difference? From a quick test it does not seem to be the case. However, the problem gets worse if I use psm 13 for recognition, which is why I'm wondering whether there is a relation.

Is there something else I'm doing wrong that might lead to this problem? Or something I can improve? I have only one font (ocr-b) with a fixed height (44px plus a 2px white margin). According to this post, the sweet spot seems to be closer to 30px (for most fonts):
https://groups.google.com/forum/?#!msg/tesseract-ocr/Wdh_JJwnw94/cHjYD3cDEQAJ

Thanks, bye
Lorenzo
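For reference, a minimal sketch of how the three page segmentation modes mentioned above (6, 7 and 13) could be compared on the same line crop. It only assumes the standard `tesseract` command-line form (`tesseract IMAGE stdout --psm N -l LANG`); the image name `line.png` and the model name `ocrb` are hypothetical placeholders, not the actual files used here.

```python
import shutil
import subprocess
from typing import Dict, List, Sequence


def build_commands(image: str, lang: str, psms: Sequence[int]) -> Dict[int, List[str]]:
    """Return one tesseract command line per page segmentation mode."""
    return {
        psm: ["tesseract", image, "stdout", "--psm", str(psm), "-l", lang]
        for psm in psms
    }


def run_comparison(image: str, lang: str, psms: Sequence[int] = (6, 7, 13)) -> Dict[int, str]:
    """Run tesseract once per mode and collect the recognized text."""
    results = {}
    for psm, cmd in build_commands(image, lang, psms).items():
        # check=False: a failed run just yields empty text instead of raising.
        done = subprocess.run(cmd, capture_output=True, text=True, check=False)
        results[psm] = done.stdout.strip()
    return results


if __name__ == "__main__":
    # "line.png" and "ocrb" are assumed names for illustration only.
    if shutil.which("tesseract") is None:
        print("tesseract not found on PATH; commands that would be run:")
        for cmd in build_commands("line.png", "ocrb", (6, 7, 13)).values():
            print(" ".join(cmd))
    else:
        for psm, text in run_comparison("line.png", "ocrb").items():
            print(f"psm {psm}: {text!r}")
```

Running this over a handful of crops where the duplication occurs should show whether the extra letters depend on the segmentation mode or appear in all three.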

