Please see https://github.com/tesseract-ocr/tesseract/issues/1275#issuecomment-360367865
for details of the debug variables you can set to see the values for the different languages.

On Thu, Jun 7, 2018 at 2:06 PM Layne Wang wrote:

> Hi,
>
> I'm working on segmenting different languages from an image, so I wonder
> how tesseract chooses the output character when we give multiple languages
> on the command line.
>
> So far, what I know:
>
> - The LSTM models in the traineddata for different languages are different,
>   so I cannot combine the traineddata easily.
> - The order of the languages matters. For example, -l eng+fra
>   and -l fra+eng will give different results. The first language passed is
>   set as primary, which affects the output spacing.
>
> I would like to know:
>
> - How does tesseract choose the output character when it is in
>   different languages? Is it based on the confidence score? And what role
>   does the "primary" language play in generating the output?
>
> Thank you!
> Layne
>
> P.S. I posted the same content earlier today but could not see my post
> in the group. I would appreciate it if someone could tell me the reason.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/2d5e257e-3ebc-4d47-bbc4-2ba40bd5f35d%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/2d5e257e-3ebc-4d47-bbc4-2ba40bd5f35d%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
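
The order dependence described in the quoted message can be checked directly. The following is only a rough sketch, assuming the `tesseract` executable is on the PATH and the relevant traineddata files are installed. The helper names (`build_tesseract_command`, `run_ocr`, `compare_language_orders`) are made up for illustration and are not part of Tesseract. It runs the same image with the languages in both orders and reports whether the text differs. It does not show how Tesseract picks between languages internally.

```python
import subprocess
import sys


def build_tesseract_command(image_path, languages):
    """Return the argv list for an OCR run that prints text to stdout."""
    if not languages:
        raise ValueError("at least one language code is required")
    # Languages are joined with '+'. According to the message above, the
    # first language listed is treated as the primary one.
    return ["tesseract", image_path, "stdout", "-l", "+".join(languages)]


def run_ocr(image_path, languages):
    """Run tesseract and return the recognized text.

    Assumes the tesseract binary and the traineddata files are installed.
    """
    result = subprocess.run(
        build_tesseract_command(image_path, languages),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def compare_language_orders(image_path, languages):
    """OCR the image with the languages in the given and reversed order."""
    forward = run_ocr(image_path, languages)
    backward = run_ocr(image_path, list(reversed(languages)))
    return forward, backward, forward == backward


if __name__ == "__main__":
    # Usage: python compare_langs.py page.png eng fra
    image = sys.argv[1]
    langs = sys.argv[2:] or ["eng", "fra"]
    first, second, same = compare_language_orders(image, langs)
    print("identical output" if same else "output differs between orders")
```

Comparing the two outputs line by line (for example with `difflib`) shows where spacing or character choices change with the primary language.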

