[tesseract-ocr] How to output multiple matches/results instead of just one?

shutupyoudontknowitall Thu, 05 May 2016 05:23:10 -0700


I want to use Tesseract to recognize images of a single Chinese/Japanese 
character, handwritten by a client-side user on an HTML5 canvas.



After several tests, I find Tesseract's recognition of handwritten Chinese 
characters to be very good, but sometimes the result is slightly off. 
Sometimes I get the correct 木 character as output, but other times I get 
the similar-looking 本 character.


How can I make Tesseract output a list of, say, the 10 best matches, one 
of which will likely be the desired result, instead of a single, possibly 
incorrect, result?


I want it to do something like:


Command: tesseract tree.gif out -psm 10 -l jpn

Output (best matches): 本, 木, 休, 十, 八, 六, ... etc.
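To make the request concrete, here is a minimal sketch of the behaviour I am 
after. It assumes there is some way to get per-character confidence scores 
out of Tesseract, which is exactly the part I do not know how to do. The 
function name top_matches and the scores are hypothetical placeholders, not 
part of Tesseract and not real output.

```python
from typing import Iterable, List, Tuple


def top_matches(candidates: Iterable[Tuple[str, float]], n: int = 10) -> List[str]:
    """Return the n highest-scoring candidate characters, best first."""
    # Sort by score, highest first; ties keep their original order.
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [char for char, _score in ranked[:n]]


if __name__ == "__main__":
    # Hypothetical scores for illustration only; NOT real Tesseract output.
    scores = [("木", 0.41), ("本", 0.44), ("休", 0.08), ("十", 0.04)]
    print(", ".join(top_matches(scores, n=3)))
```

With a ranked list like this, the client could show the user a few 
candidates to pick from instead of relying on the single top result.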


An example image I'm using is below.

<https://lh3.googleusercontent.com/-wEN7FNMLGbc/VysTLEjen-I/AAAAAAAAAAM/iR2d7jORK4MjZuCgQGhqZIUc7_0dEyC5ACLcB/s1600/tree.gif>

