As I always say, send the sample image(s) and describe what you need exactly. Maybe you're looking in the wrong direction.
Warm regards,
Dmitri Silaev

On Tue, Mar 29, 2011 at 7:34 AM, liuguanqiang <[email protected]> wrote:
> Thanks for your reply.
> In another case, I use Tesseract to recognize Chinese characters.
> Some Chinese characters are recognized as other, wrong Chinese characters,
> even though they look very different.
> Is the reason that these Chinese characters have many (dense) strokes?
> In this case, detecting ROIs does not help.
> My question is: which Tesseract variables control the classifier match metrics?
> I want to tune these variables to solve this problem or
> improve the reject rate.
> Best regards
> 2011-03-29
> ________________________________
> liuguanqiang
> ________________________________
> From: Dmitri Silaev
> Sent: 2011-03-27 05:36:01
> To: tesseract-ocr
> Cc: liuguanqiang
> Subject: Re: tesseract improve the reject rate ?
> When you have a small trained alphabet, Tesseract's classifier
> may sometimes find no suitable match, in which case it outputs
> a null character, which is later converted to a space. However, in your
> case there are Chinese characters that have many strokes and
> outlines, many of which (partially) match the characters from
> your whitelist. So be ready for a number of false detections even
> when your alphabet is small, e.g. when you train Tess to recognize only digits.
> The best approach would be to determine where the regions of
> interest (ROIs) are located, and then run the recognition over them,
> using appropriate whitelists.
> Warm regards,
> Dmitri Silaev
> On Sat, Mar 26, 2011 at 8:44 AM, liuguanqiang <[email protected]> wrote:
>> hi:
>> I use Tesseract to recognize digits (setwhitelist "0123456789") using
>> eng.traineddata.
>> There are also characters from another set (Chinese) in the test image, but
>> Tesseract recognizes the Chinese characters as digits.
>> Are there Tesseract variables to control this situation? Is this problem
>> the same as "improving the reject rate"?
>> The following picture (binary) is recognized as "5221555255". How can I make
>> Tesseract output null?
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en.
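
For readers following this thread, here is a minimal sketch of one way to approach the "output null" question: reject results after recognition instead of tuning the classifier. It is not something proposed in the messages above. It assumes your OCR wrapper can report a confidence per recognized word on a 0-100 scale; the function name `reject_low_confidence`, the threshold of 80, and the confidence values in the sample are all hypothetical and only illustrate the idea.

```python
from typing import List, Tuple


def reject_low_confidence(
    words: List[Tuple[str, float]],
    allowed: str = "0123456789",
    min_conf: float = 80.0,
) -> List[str]:
    """Return each word unchanged if it passes both checks, else ''.

    A word passes when it is non-empty, every character is in `allowed`,
    and its confidence is at least `min_conf`. Both the threshold and the
    0-100 confidence scale are assumptions for this sketch.
    """
    kept = []
    for text, conf in words:
        passes = (
            bool(text)
            and all(ch in allowed for ch in text)
            and conf >= min_conf
        )
        kept.append(text if passes else "")
    return kept


# Hypothetical (text, confidence) pairs. The first string is the false
# detection quoted in the thread; the confidence values are made up.
sample = [("5221555255", 41.0), ("20110329", 93.5)]
print(reject_low_confidence(sample))  # ['', '20110329']
```

The threshold has to be chosen on your own images: set it too low and the Chinese text still comes through as digits, set it too high and real digits are rejected. This complements, and does not replace, restricting recognition to regions of interest.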

