Hi, I'm working on segmenting text in different languages from an image, and I wonder how Tesseract chooses the output characters when multiple languages are given on the command line.
So far, what I know:
- The LSTM models in the traineddata files differ between languages, so I cannot easily combine the traineddata.
- The order of the languages matters. For example, -l eng+fra and -l fra+eng give different results. The first language passed is set as primary, which affects the output spacing.

I would like to know:
- How does Tesseract choose the output character when the candidates come from different languages? Is it based on the confidence score? And how does the "primary" language play a role in generating the output?

Thank you!
Layne

P.S. I posted the same content earlier today but could not see my post in the group. I would appreciate it if someone could tell me the reason.
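
For anyone who wants to reproduce the ordering difference, here is a minimal sketch in Python that runs the tesseract command line with both language orders and compares the text. It assumes the tesseract binary is on PATH and that the eng and fra traineddata are installed. The file name sample.png and the helper names build_command and run_ocr are hypothetical, made up for illustration only.

```python
import os
import shutil
import subprocess


def build_command(image_path, languages):
    """Build a tesseract CLI call that prints the recognized text to stdout."""
    # Languages are joined with '+'; the order in this list is the order
    # passed to -l, so the first entry is the one described above as primary.
    return ["tesseract", image_path, "stdout", "-l", "+".join(languages)]


def run_ocr(image_path, languages):
    """Run tesseract and return the recognized text as a string."""
    result = subprocess.run(
        build_command(image_path, languages),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    image = "sample.png"  # hypothetical file name, replace with a real image
    if shutil.which("tesseract") is None or not os.path.exists(image):
        print("tesseract or the sample image is missing, nothing to compare")
    else:
        eng_first = run_ocr(image, ["eng", "fra"])
        fra_first = run_ocr(image, ["fra", "eng"])
        print("identical" if eng_first == fra_first else "outputs differ")
```

This only shows whether the two orders produce different text for a given image. It does not reveal how the character is selected internally, which is the open question.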

