Hello, I'm working on the same thing as you, but for licence plates from Argentina, where I live.
As you described, my first problem was locating the licence plate. Now I'm working on the OCR, and after that I will work on making the images horizontal, because the OCR fails if they are not completely horizontal. For example, today I was getting a 5 instead of a 6. When I straightened the image with Photoshop, everything came out fine.

I don't know the layout of letters and numbers on California plates. Is it fixed, or can they appear in any position? If you know whether a character should be a number or a letter according to its position, you have two options (as far as I know):

- When recognizing char by char, tell Tesseract that you expect a number or a letter. I saw that somewhere inside the source code, but I don't remember where.
- Make your own conversion, e.g., if you are expecting a number and you get a G, map it to a 6; if you are expecting a letter and you get a 2, map it to a Z.

I think I'll use the second one, but I'm not on that part yet.

I'm getting good results on images where the characters are big because the camera is close, but with small characters (13 pixels high) things are not good. So I have a pair of ideas to test, and perhaps somebody from the group could give me opinions on them:

- Following the contour of each character with a polygon approximation, making an image from those contours, and running Tesseract (trained for that) on that image.
- Making a training image with my font (one of each character of the alphabet) and repeating the alphabet at different threshold levels. I think that internally Tesseract thresholds the images. This is hard to explain, but I think it may improve the quality.

If you want to keep talking about the specifics of licence plate recognition, we can continue privately, because it's off topic here. I'm interested in continuing. There are many things to talk about, for example camera prices, light filters, execution times, etc.
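Going back to the second option (making your own conversion by expected position), a minimal Python sketch could look like the one below. Everything in it is an assumption for illustration only: the function name `correct_plate`, the two confusion tables, and the `"LLLNNN"` layout string (L for letter, N for number) are hypothetical, and the tables should be filled in from the mistakes you actually observe.

```python
# Hypothetical confusion tables: extend them with the errors you really see.
TO_DIGIT = {"G": "6", "Z": "2", "O": "0", "I": "1", "S": "5", "B": "8"}
TO_LETTER = {"6": "G", "2": "Z", "0": "O", "1": "I", "5": "S", "8": "B"}


def correct_plate(text: str, layout: str) -> str:
    """Fix letter/digit confusions using the expected kind of each position.

    layout has one character per plate position:
    'L' where a letter is expected, 'N' where a number is expected.
    """
    if len(text) != len(layout):
        # Cannot align positions, so leave the OCR output untouched.
        return text

    corrected = []
    for ch, kind in zip(text.upper(), layout):
        if kind == "N" and not ch.isdigit():
            ch = TO_DIGIT.get(ch, ch)    # e.g. G -> 6
        elif kind == "L" and not ch.isalpha():
            ch = TO_LETTER.get(ch, ch)   # e.g. 2 -> Z
        corrected.append(ch)
    return "".join(corrected)


if __name__ == "__main__":
    # Assumed layout: three letters followed by three numbers.
    print(correct_plate("AB2G45", "LLLNNN"))
```

This only helps with letter/number confusions. A letter misread as another letter (such as an O read as a U) stays as it is, because the position gives no hint about which letter was meant.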
You can write to me at andrej100 at gmail.

Regards,
Andres

2010/7/28 ZIA <[email protected]>
> I am writing a license plate recognition application in C#. I am
> almost done, i have started work on my own OCR,but then I decided to
> use tessearact-ocr, which now partially works. I provide the
> california license plate to ocr, but some of the font, it doesn't
> recognizes, for example, like "Z" becomes number 2, letter "O" becomes
> "U", and number 4 becomes something else. Any suggestion? any language
> file or font file that will solve this issue. Beside that in complex
> images, i am having hard time to locate License plate. but my concern
> is now on ocr, since i thought i would save time by using tesseract
> then writing my own neural network. I would really appreciate any
> ideas or suggestions.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.

