Hi Galt,

I've been having a very similar problem with some of the text I'm
training on, which has several diacritics above and below the glyphs.
I quite often get lines of garbage where some of the diacritics have
been segmented as a line of their own, so the lines above and below
are recognized without those diacritics.

Switching to -psm 6 helped significantly, but I'm not entirely sure
why it makes a difference, and diacritics are still sometimes
assigned to the wrong line (though much less often).
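
For reference, the invocation I mean looks roughly like the sketch
below. The file names and language code are placeholders, not my
actual training data. As far as I can tell from the usage text, page
segmentation mode 6 means "assume a single uniform block of text".

```shell
# Placeholders (assumptions): page.tif is the input image, "page" is the
# output base name (Tesseract writes page.txt), and "eng" is the language.
# -psm 6 selects page segmentation mode 6 in Tesseract 3.x:
# "Assume a single uniform block of text."
tesseract page.tif page -l eng -psm 6
```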

How did you fix the problem in your case? Also, can anybody explain
why -psm 6 makes such a big difference? Does it ensure lines are at
least a certain height, or is it something else?

Thanks

Nick
