Hello Jimmy, Thank you for your message.
I'm writing between your lines:

2010/7/29 Jimmy O'Regan <[email protected]>

> On 29 July 2010 03:23, Andres <[email protected]> wrote:
> > Hello,
> >
> > I'm working on the same thing as you, for licence plates from
> > Argentina, as I live in Argentina.
> >
> > As you described, the problem was locating the licence plate.
> >
> > Now I'm working on the OCR, and then I will work on horizontalizing
> > the images, because if they are not completely horizontal the OCR
> > fails. For example, today I was getting a 5 instead of a 6. When I
> > horizontalized the image with Photoshop, everything turned out OK.
> >
> > I don't know the layout of the letter and number positions on
> > California plates. Are they mixed? If you know whether a character
> > should be a number or a letter according to its position, you have
> > two options (as far as I know):
> >
> > - when recognizing char by char, tell Tesseract that you expect a
> > number or a letter. I saw that somewhere inside the source code, I
> > don't remember where.
>
> You were probably looking at the code that guesses among 1, l and i

I think I saw somewhere that it was possible to configure whether you
expect numbers or letters, but I'm not sure anymore.

> Most of the code in the dict/ directory does some variation on this,
> by 'permuting' the character possibilities.
>
> > - make your own conversion, e.g., if you are expecting a number and
> > you get a G, map it to a 6; if you expect a letter and get a 2, map
> > it to a Z.
> >
> > Patrick may have more details on this approach.
>
> According to Wikipedia
> (http://en.wikipedia.org/wiki/Vehicle_registration_plates_of_Argentina),
> the normal Argentinian license plates follow the template AAA 000, so
> you could just generate the possible combinations, and use them in a
> dawg.
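A minimal sketch of the "make your own conversion" option, assuming the AAA 000 template quoted above. It is written in Python only for illustration. The helper name `correct_plate` and both lookup tables are hypothetical; only the G/6 and Z/2 pairs come from this thread, the other pairs are assumed look-alikes that would need checking against real plates.

```python
import string

# "A" marks a position where a letter is expected, "0" a digit.
TEMPLATE = "AAA000"

# G->6 and Z->2 are the pairs mentioned in the thread.
# The remaining pairs are assumptions added for illustration.
TO_DIGIT = {"G": "6", "Z": "2", "O": "0", "I": "1", "S": "5", "B": "8"}
TO_LETTER = {"6": "G", "2": "Z", "0": "O", "1": "I", "5": "S", "8": "B"}


def correct_plate(raw, template=TEMPLATE):
    """Fix look-alike characters according to the expected position type.

    Returns the corrected string, or None if the text cannot be made
    to fit the template.
    """
    chars = [c for c in raw.upper() if not c.isspace()]
    if len(chars) != len(template):
        return None

    fixed = []
    for ch, slot in zip(chars, template):
        if slot == "0" and ch not in string.digits:
            ch = TO_DIGIT.get(ch, ch)    # letter seen where a digit belongs
        elif slot == "A" and ch not in string.ascii_uppercase:
            ch = TO_LETTER.get(ch, ch)   # digit seen where a letter belongs
        fixed.append(ch)

    # Reject the result if any position still has the wrong character type.
    for ch, slot in zip(fixed, template):
        allowed = string.digits if slot == "0" else string.ascii_uppercase
        if ch not in allowed:
            return None
    return "".join(fixed)


if __name__ == "__main__":
    print(correct_plate("A2C G45"))
```

The same function should work for another layout by passing a different template string, for example "0AAA000" for the California scheme.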
>
> perl -e 'for $a (65..90){for $b (65..90) {for $c (65..90) {printf "%c%c%c\n", $a, $b, $c;}}}'
> perl -e 'for $a (0..9){for $b (0..9) {for $c (0..9) {printf "%d%d%d\n", $a, $b, $c;}}}'
>
> Will get you the two lists you want.

Thank you very much for this idea. The resulting set of words (for the
full six characters) would have 17,576,000 lines. How does Tesseract
access a list like this? Isn't it too big?

> (For the original question, according to
> http://en.wikipedia.org/wiki/Vehicle_registration_plates_of_California
> this is the California scheme:
> perl -e 'for $a (0..9){for $b (65..90){for $c (65..90) {for $d (65..90) {for $e (0..9){for $f (0..9) {for $g (0..9) {printf "%d%c%c%c%d%d%d\n", $a, $b, $c, $d, $e, $f, $g;}}}}}}}'
>
> > I think that I'll use the last one, I'm not on that part yet. I'm
> > getting good results on images where the characters are big because
> > of the distance of the camera, but with small letters (13 pixels
> > high) things are not good.
> >
> > So I have a pair of ideas to test, perhaps somebody from the group
> > could give me opinions regarding them:
> > - following the contour, with polygon approximation of the chars,
> > making an image with those contours and running Tesseract on that
> > image (trained for that)
>
> Seems reasonable. Something like autotrace or potrace might be useful.

Glad to read that. Since I use OpenCV, I usually use the
cvFindContours() function and then cvApproxPoly().

> > - make an image with my font (one of each from the alphabet), and
> > repeating the alphabet with different levels of threshold. I think
> > that internally Tesseract thresholds the images. Hard to explain
> > this, but I think that it may improve the quality.
>
> Yes, Tesseract internally thresholds the image. I think Google did
> something like this in the Tesseract 3 language packs, so it might be
> worth doing.
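Since the threshold idea is hard to explain in words, here is a minimal sketch of it in plain Python, with a grayscale image represented as a list of rows of 0-255 values. The function names and the three threshold levels are hypothetical and chosen only for illustration. The sketch says nothing about how Tesseract picks its own threshold.

```python
def binarize(gray, threshold):
    """Return a 0/1 image where 1 marks pixels darker than the threshold."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def threshold_variants(gray, levels=(96, 128, 160)):
    """Binarize the same glyph image at several threshold levels.

    The levels are arbitrary example values, not taken from Tesseract.
    """
    return {level: binarize(gray, level) for level in levels}


if __name__ == "__main__":
    # A tiny fake glyph edge: white background fading into a dark stroke.
    glyph = [
        [250, 140, 30],
        [240, 110, 20],
    ]
    for level, image in threshold_variants(glyph).items():
        print(level, image)
```

In this toy example the stroke gets thicker as the level rises, which is the variation the training image would be meant to capture.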
Do you know if it uses automatic threshold levels, or if there is some
place to configure it?

> > If you want to continue speaking about specifics of licence plate
> > recognition, we can continue privately because it's off topic. I'm
>
> Well, you've earned my applause for recognising that, but if your
> conversation turns up information that will save someone some time
> later on, I'm all for it.

Great, I will be glad to share if something good appears.

> > interested in continuing. There are many things to speak about, for
> > example, the prices of the cameras, light filters, execution times,
> > etc.
> >
> > You can write me to andrej100 at gmail
> >
> > Regards,
> >
> > Andres
> >
> > 2010/7/28 ZIA <[email protected]>
> >>
> >> I am writing a license plate recognition application in C#. I am
> >> almost done. I had started work on my own OCR, but then I decided to
> >> use tesseract-ocr, which now partially works. I feed California
> >> license plates to the OCR, but it fails to recognize some characters
> >> of the font: for example, "Z" becomes the number 2, the letter "O"
> >> becomes "U", and the number 4 becomes something else. Any
> >> suggestions? Is there a language file or font file that will solve
> >> this issue? Besides that, in complex images I am having a hard time
> >> locating the license plate, but my concern is now the OCR, since I
> >> thought I would save time by using Tesseract rather than writing my
> >> own neural network. I would really appreciate any ideas or
> >> suggestions.
> >>
> >> --
> >> You received this message because you are subscribed to the Google
> >> Groups "tesseract-ocr" group.
> >> To post to this group, send email to [email protected].
> >> To unsubscribe from this group, send email to [email protected].
> >> For more options, visit this group at
> >> http://groups.google.com/group/tesseract-ocr?hl=en.

