[tesseract-ocr] The Output Using Multiple Languages

Layne Wang Thu, 07 Jun 2018 01:36:01 -0700

Hi,
I'm using Tesseract 4.0.0-alpha on Ubuntu 16.04.
I'm referring to the *Using Multiple Languages* section of 
https://github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage.
The wiki says that the order of the languages in the <lang1+lang2> argument 
affects the output, and that there is a priority among these languages.


My questions are:

   - What does "primary language" mean? I know it affects the spacing 
   and probably which character is output, but I'm not sure how it really 
   works.
   - How does Tesseract choose the 'best' character among all the 
   languages? Is it based on the confidence/score? And how does the order 
   of the languages in the <lang1+lang2> argument affect the output?
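
To make the second question concrete, here is a minimal sketch of how the two 
orderings could be compared side by side. It assumes the usual 
`tesseract <image> <outputbase> -l <langs>` command-line form. The image name, 
the language codes, and the helper names `build_command` and `run_both_orders` 
are placeholders for illustration, not part of Tesseract.

```python
import shutil
import subprocess


def build_command(image, output_base, languages):
    """Return the tesseract argv; the first language listed comes first in -l."""
    return ["tesseract", image, output_base, "-l", "+".join(languages)]


def run_both_orders(image, lang_a, lang_b):
    """Run tesseract once per language ordering and return the commands used."""
    commands = [
        build_command(image, f"out_{lang_a}_{lang_b}", [lang_a, lang_b]),
        build_command(image, f"out_{lang_b}_{lang_a}", [lang_b, lang_a]),
    ]
    # Only invoke the binary when it is actually installed (assumption: on PATH).
    if shutil.which("tesseract") is not None:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_both_orders("page.png", "eng", "chi_sim")` would write one text 
file per ordering, so the two results can be diffed to see where the order 
changes the output.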

Thanks in advance!
Layne
