Thanks for your reply.
In another case, I use Tesseract to recognize Chinese characters.
Some Chinese characters are recognized as other, wrong Chinese characters, even though
they are very different in appearance.
Is this because Chinese characters have many dense strokes?
In this case, detecting ROIs does not help.
My question is: which Tesseract variables control the classifier's match metrics?
I want to tune these variables to solve this problem or improve the reject
rate.
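
Independent of any classifier variables, one way to raise the reject rate is to post-filter the results on per-word confidence. The following is only a minimal sketch, assuming the recognized words and their confidences (on a 0-100 scale) have already been obtained from the OCR engine. The function name `reject_low_confidence` and the threshold of 60 are hypothetical and would need tuning on real data.

```python
def reject_low_confidence(words, confidences, min_conf=60.0):
    """Replace every word whose confidence is below min_conf with ''.

    words       -- recognized strings, one per word
    confidences -- matching confidence values (assumed 0-100 scale)
    min_conf    -- hypothetical rejection threshold, to be tuned
    """
    if len(words) != len(confidences):
        raise ValueError("words and confidences must have the same length")
    # A rejected word becomes an empty string, i.e. a "null" output.
    return [w if c >= min_conf else "" for w, c in zip(words, confidences)]


if __name__ == "__main__":
    # Made-up values for illustration only.
    print(reject_low_confidence(["5221555255", "2011"], [31, 92]))
```

A higher threshold rejects more false detections but also discards more correct results, so the value is a trade-off rather than a fixed setting.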
Best regards,
liuguanqiang
2011-03-29

From: Dmitri Silaev
Sent: 2011-03-27 05:36:01
To: tesseract-ocr
Cc: liuguanqiang
Subject: Re: tesseract improve the reject rate ?
When you have a small trained alphabet, Tesseract's classifier may
sometimes find no suitable match, in which case it outputs a null
character that is then converted to a space. However, in your case
there are Chinese characters with many strokes and outlines, many of
which (partially) match the characters in your whitelist. So be ready
for a number of false detections even when your alphabet is small,
e.g. when you train Tesseract to recognize only digits.
The best approach would be to first locate the regions of interest
(ROIs), and then run the recognition over them using appropriate
whitelists.
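
The ROI approach described above can be sketched roughly as follows. This is an illustration under assumptions, not Tesseract code: the image is a plain list of pixel rows, each ROI is a hypothetical `(left, top, width, height)` box paired with its whitelist, and `recognize` is a placeholder callable standing in for whatever OCR call is actually used.

```python
def crop(image, box):
    """Return the sub-image covered by box = (left, top, width, height).

    image is assumed to be a list of rows, each row a list of pixels.
    """
    left, top, width, height = box
    return [row[left:left + width] for row in image[top:top + height]]


def recognize_rois(image, rois, recognize):
    """Run recognition on each ROI with its own whitelist.

    rois      -- list of (box, whitelist) pairs
    recognize -- hypothetical callable(sub_image, whitelist) -> str,
                 a stand-in for the real OCR engine call
    """
    results = []
    for box, whitelist in rois:
        text = recognize(crop(image, box), whitelist)
        # Final guard: drop any character that is not in the whitelist.
        results.append("".join(ch for ch in text if ch in whitelist))
    return results
```

Because each region carries its own whitelist, areas known to contain Chinese text are simply never passed to the digit recognizer.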
Warm regards,
Dmitri Silaev
On Sat, Mar 26, 2011 at 8:44 AM, liuguanqiang <[email protected]> wrote:
> Hi,
> I use Tesseract to recognize digits (whitelist set to "0123456789") using
> eng.traineddata.
> The test image also contains characters from another character set (Chinese),
> but Tesseract recognizes the Chinese characters as digits.
> Are there Tesseract variables that control this behavior? Is this problem
> equivalent to "improving the reject rate"?
> The following picture (binary) is recognized as "5221555255". How can I make
> Tesseract output null?
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.
>
