Hi Art,

My program already does what you suggested: it isolates each character and runs Tesseract with -psm 10. However, the results have not improved.

I have one question about this. When using -psm 10, what background color should be used? I suspect Tesseract sometimes cannot tell whether black or white is the background, and then gives a bad result. Is there an option in Tesseract for setting the background color or the text color?

I have found some parameters that look related, but I don't know what values they expect. For example, the preset values don't make much sense to me: why is editor_image_text_color set to 2, and so on?

I would really appreciate any help. Thank you.

name                        value  description
editor_image_word_bb_color  7      Word bounding box colour
editor_image_blob_bb_color  4      Blob bounding box colour
editor_image_text_color     2      Correct text colour
ref: http://www.sk-spell.sk.cx/tesseract-ocr-parameters-in-302-version

Alex

On Friday, April 1, 2016 at 2:35:26 AM UTC+8, Art Rhyno wrote:
> Hi,
>
> Tesseract is detecting the blobs for each character correctly at least.
> One trick is to leverage the coordinates of each character for extracting
> individual images, invert the colours, and use single character mode
> (-psm 10) to do the recognition. I think you have to dig into the API to
> get the character coordinates or use the makebox option (e.g. tesseract
> license.png license makebox). If you isolate each character, it usually
> recognizes it, not something that is recommended for a lot of text but
> maybe worthwhile in this case.
>
> art
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Alex Szeto
> Sent: Wednesday, March 30, 2016 11:17 AM
> To: tesseract-ocr <[email protected]>
> Subject: [tesseract-ocr] High Error rate even if good quality image and
> low noise
>
> I am working on a license plate recognition project, and I am having
> trouble improving the accuracy of the OCR.
>
> Attached is one of the images I used; the result is very poor.
>
> Version of tesseract: 3.0.3
>
> The command that I used: tesseract Untitled.jpg out -psm 9
>
> The result is SXUSBBB, while I am expecting 5X0S888.
>
> I have done some experiments and found that some character pairs are
> easily confused by tesseract.
>
> For example: '0' becomes 'U'; '5' and 'S'; 'B' and '8'.
>
> Are there any methods or parameters I can set so that the result can be
> improved?
>
> Thanks a lot, I would really appreciate any advice.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/abcbfacf-3491-4b85-87b1-a43e5e4de56f%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

--
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/525e45b7-be8d-4f05-beca-a6740661d198%40googlegroups.com.
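
The per-character approach discussed in this thread (take coordinates from makebox, cut out each character, fix the colours, then run -psm 10) could be sketched roughly as below. This is only an illustration, not anything taken from Tesseract itself: the helper names (parse_box_file, crop_rect, dark_text_on_light, single_char_command), the padding of 4 pixels and the threshold of 128 are all made up for the example. It assumes the usual box file layout of "<char> <left> <bottom> <right> <top> <page>" with the origin at the bottom-left of the image, and it assumes that dark text on a light background is the safer polarity to hand to Tesseract. The image is represented as a plain list of grayscale rows so the sketch needs no imaging library.

```python
from typing import List, NamedTuple, Tuple


class CharBox(NamedTuple):
    char: str
    left: int
    bottom: int
    right: int
    top: int


def parse_box_file(text: str) -> List[CharBox]:
    """Parse box file lines: '<char> <left> <bottom> <right> <top> <page>'."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        left, bottom, right, top = (int(p) for p in parts[1:5])
        boxes.append(CharBox(parts[0], left, bottom, right, top))
    return boxes


def crop_rect(box: CharBox, image_height: int, image_width: int,
              pad: int = 4) -> Tuple[int, int, int, int]:
    """Convert a bottom-left-origin box into a padded (left, upper, right,
    lower) rectangle with a top-left origin, clamped to the image.
    The padding value is an arbitrary choice for this sketch."""
    left = max(box.left - pad, 0)
    upper = max(image_height - box.top - pad, 0)
    right = min(box.right + pad, image_width)
    lower = min(image_height - box.bottom + pad, image_height)
    return left, upper, right, lower


def border_is_dark(pixels: List[List[int]], threshold: int = 128) -> bool:
    """Guess the background from the border pixels of a grayscale crop."""
    border = list(pixels[0]) + list(pixels[-1])
    border += [row[0] for row in pixels] + [row[-1] for row in pixels]
    return sum(border) / len(border) < threshold


def dark_text_on_light(pixels: List[List[int]]) -> List[List[int]]:
    """Invert the crop if its background looks dark (assumed preference)."""
    if border_is_dark(pixels):
        return [[255 - v for v in row] for row in pixels]
    return pixels


def single_char_command(image_path: str, output_base: str) -> List[str]:
    """Command line for single character mode, as used in this thread."""
    return ["tesseract", image_path, output_base, "-psm", "10"]
```

Guessing the background from the border of each padded crop avoids having to tell Tesseract which colour is the background: every character image is normalised to the same polarity before recognition. Whether that actually improves the 0/U, 5/S and B/8 confusions on this plate would need to be tested.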

