By the way, the sample image I uploaded is a failing case: I cannot get "Mx1251C" recognized correctly, even when I limit the character set to 0-9 and A-Z.
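
For reference, the setup described in this thread (digits and capital letters only, no dictionary) could be sketched roughly as below. This is only an illustration, not the exact configuration used on the sample image. It assumes the `tesseract` command-line tool, a label already cropped to the single text line of interest (hence page segmentation mode 7), and the config variables `tessedit_char_whitelist`, `load_system_dawg` and `load_freq_dawg`. The helper names `build_tesseract_cmd` and `style_number_candidates`, the file name `crop.png`, and the minimum token length of 5 are hypothetical. Older 3.x releases spell the flag `-psm` rather than `--psm`, and some LSTM-based releases may ignore the whitelist.

```python
import re
import string

# Characters the style number may contain: 0-9 and A-Z.
WHITELIST = string.digits + string.ascii_uppercase


def build_tesseract_cmd(image_path, output_base="stdout", psm=7):
    """Build an argv list for the tesseract CLI (hypothetical helper).

    Restricts recognition to WHITELIST and switches off the word
    dictionaries, since style numbers are not English words.
    """
    return [
        "tesseract", image_path, output_base,
        "--psm", str(psm),  # 7 = treat the image as a single text line
        "-c", "tessedit_char_whitelist=" + WHITELIST,
        "-c", "load_system_dawg=0",  # assumption: disables the main dictionary
        "-c", "load_freq_dawg=0",    # assumption: disables the frequent-word list
    ]


def style_number_candidates(ocr_text, min_len=5):
    """Keep tokens that mix digits and capital letters (hypothetical filter)."""
    tokens = re.findall(r"[0-9A-Z]+", ocr_text)
    return [
        t for t in tokens
        if len(t) >= min_len and re.search(r"\d", t) and re.search(r"[A-Z]", t)
    ]
```

The argument list can be passed to `subprocess.run(build_tesseract_cmd("crop.png"), capture_output=True, text=True)`, and the resulting text fed to `style_number_candidates` to discard anything that does not look like a style number.
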
On Monday, February 17, 2014 12:59:09 PM UTC+8, Richard Wang wrote:
>
> hello,
>
> I am trying to perform OCR on merchandise label with Chinese chars.
> Here I have uploaded an example image. The character string I really
> care about is "MX1251C".
>
> Most of such string is composed by digit number and A-Z characters,
> so I can configure the range of characters. Also, they are not English
> words, so I can disable use of dictionary.
>
> As you can see, most of such image will contain a lot of Chinese
> characters,
> but I do not care about them.
>
> My question is that if the existence of these Chinese characters make
> my problem more difficult, compared to the case if they are English
> letters.
>
> If I want to speed up the recognition process and accuracy, how can I take
> advantage of the special properties of my problem here?
>
> thanks.
> Richard
>
>
> <https://lh4.googleusercontent.com/-_hlnACBMN5o/UwGViQ65HZI/AAAAAAAAAEs/D6Y9i60SrSE/s1600/IMG_20140215_152033.jpg>
> I only need to recognize and extract the style number.
>

--
--
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en

