Hello,

Permuting may work, but I haven't tried it. I am also looking for a font sample of the CA license plate typeface, so that I can train my own OCR. I really don't know where I can get sample images of A to Z and 0 to 9 in the CA license plate font.
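Since I haven't tried permuting yet, here is a rough Python sketch of the simpler alternative Andres describes below: correct each character according to whether its position should hold a digit or a letter. It assumes the California scheme from the quoted message (one digit, three letters, three digits). Only the G/6 and Z/2 pairs come from this thread; the other pairs and the names `TEMPLATE`, `TO_DIGIT`, `TO_LETTER` and `fix_by_position` are my own assumptions for illustration.

```python
# Assumed plate template: D = digit, L = letter (California scheme quoted below).
TEMPLATE = "DLLLDDD"

# Confusion tables. G/6 and Z/2 are from this thread; the rest are guesses
# to be replaced with whatever your own OCR output actually mixes up.
TO_DIGIT = {"G": "6", "Z": "2", "O": "0", "I": "1", "S": "5", "B": "8"}
TO_LETTER = {"6": "G", "2": "Z", "0": "O", "1": "I", "5": "S", "8": "B"}

DIGITS = "0123456789"
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def fix_by_position(text, template=TEMPLATE):
    """Return text corrected to fit the template, or None if it cannot fit."""
    if len(text) != len(template):
        return None
    out = []
    for ch, kind in zip(text.upper(), template):
        if kind == "D":
            ch = TO_DIGIT.get(ch, ch)    # a letter where a digit is expected
            if ch not in DIGITS:
                return None
        else:
            ch = TO_LETTER.get(ch, ch)   # a digit where a letter is expected
            if ch not in LETTERS:
                return None
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(fix_by_position("42BC1G3"))
```

This only repairs digit/letter mix-ups. It cannot help with letter-for-letter errors such as "O" being read as "U", which is why I still want the font samples for training.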
For LP extraction, I am trying to implement a kind of rectangular window (a concept from SCW, described in one paper). So far I have applied an edge filter, which shows the license plate clearly; I just need to extract it. A simple histogram approach works when there is not a lot of noise, but even reflections in the image cause problems.

On Jul 29, 5:38 am, "Jimmy O'Regan" <[email protected]> wrote:
> On 29 July 2010 03:23, Andres <[email protected]> wrote:
>
> > Hello,
>
> > I'm working on the same as you, for the licence plates from Argentina, as I
> > live in Argentina.
>
> > Same as you described, the problem was to locate the licence plate.
>
> > Now I'm working with the OCR and then I will work on horizontalizing the
> > images, because if they are not completely horizontal, the OCR fails, for
> > example today I was getting a 5 instead of a 6. When I horizontalized the
> > image with photoshop, everything turned to ok.
>
> > I don't know how is the layout of the positions of letters and numbers in
> > California plates, are they assorted ? ...if you know if the character
> > should be a number or a letter according to its position, you have two
> > options (as far as I know):
>
> > - when recognizing char by char, tell Tesseract that you expect a number or
> > a letter. I saw that somewhere inside the source code, don't remember
> > where.
>
> You were probably looking at the code that guesses among 1, l and i
>
> Most of the code in the dict/ directory does some variation on this,
> by 'permuting' the character possibilities.
>
> > - make your own conversion, e.g., if you are expecting a number and you get
> > a G, map it to a 6, if you are expecting a letter and you get a 2, map it
> > to a Z.
>
> Patrick may have more details on this approach.
>
> According to Wikipedia
> (http://en.wikipedia.org/wiki/Vehicle_registration_plates_of_Argentina),
> the normal Argentinian license plates follow the template AAA 000, so
> you could just generate the possible combinations, and use them in a
> dawg.
>
> perl -e 'for $a (65..90){for $b (65..90) {for $c (65..90) {printf "%c%c%c\n", $a, $b, $c;}}}'
> perl -e 'for $a (0..9){for $b (0..9) {for $c (0..9) {printf "%d%d%d\n", $a, $b, $c;}}}'
>
> Will get you the two lists you want.
>
> (For the original question, according to
> http://en.wikipedia.org/wiki/Vehicle_registration_plates_of_California
> this is the California scheme:
> perl -e 'for $a (0..9){for $b (65..90){for $c (65..90) {for $d (65..90) {for $e (0..9){for $f (0..9) {for $g (0..9) {printf "%d%c%c%c%d%d%d\n", $a, $b, $c, $d, $e, $f, $g;}}}}}}}'
>
> > I think that I'll use the last one, I'm not on that part yet. I'm getting
> > good results on images where the characters are big because of the distance
> > of the camera, but in small letters (13 pixels height) things are not good.
>
> > So I have a pair of ideas to test, perhaps somebody from the group could
> > give me opinions regarding them:
> > - following the contour, with polygon approximation of the chars, making an
> > image with that contours and running Tesseract on that image (trained for
> > that)
>
> Seems reasonable. Something like autotrace or potrace might be useful.
>
> > - make an image with my font (one of each from the alphabet), and repeating
> > the alphabet with different levels of threshold. I think that internally
> > Tesseract thresholds the images. Hard to explain this, but I think that it
> > may improve the quality.
>
> Yes, Tesseract internally thresholds the image. I think Google did
> something like this in the Tesseract 3 language packs, so it might be
> worth doing.
>
> > If you want to continue speaking about specifics of licence plate
> > recognition, we can continue privately because it's off topic.
> > I'm
>
> Well, you've earned my applause for recognising that, but if your
> conversation turns up information that will save someone some time
> later on, I'm all for it.
>
> > interested in continuing. There are many things to speak about, for example,
> > the prices of the cameras, light filters, times of execution, etc.
>
> > You can write me to andrej100 at gmail
>
> > Regards,
>
> > Andres
>
> > 2010/7/28 ZIA <[email protected]>
>
> >> I am writing a license plate recognition application in C#. I am
> >> almost done, i have started work on my own OCR,but then I decided to
> >> use tessearact-ocr, which now partially works. I provide the
> >> california license plate to ocr, but some of the font, it doesn't
> >> recognizes, for example, like "Z" becomes number 2, letter "O" becomes
> >> "U", and number 4 becomes something else. Any suggestion? any language
> >> file or font file that will solve this issue. Beside that in complex
> >> images, i am having hard time to locate License plate. but my concern
> >> is now on ocr, since i thought i would save time by using tesseract
> >> then writing my own neural network. I would really appreciate any
> >> ideas or suggestions.
>
> >> --
> >> You received this message because you are subscribed to the Google Groups
> >> "tesseract-ocr" group.
> >> To post to this group, send email to [email protected].
> >> To unsubscribe from this group, send email to
> >> [email protected].
> >> For more options, visit this group at
> >> http://groups.google.com/group/tesseract-ocr?hl=en.
>
> --
> <Leftmost> jimregan, that's because deep inside you, you are evil.
> <Leftmost> Also not-so-deep inside you.
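
P.S. To make the histogram approach from the top of this message concrete, here is a minimal Python sketch of what I mean, assuming the edge filter output has already been turned into a binary map (a list of rows of 0/1). It uses plain projection profiles, not the SCW method, and the function names and the `frac` cutoff are only illustrative choices of mine.

```python
def projection(edges, axis):
    """Edge counts per row (axis=0) or per column (axis=1) of a 0/1 map."""
    if axis == 0:
        return [sum(row) for row in edges]
    return [sum(col) for col in zip(*edges)]


def densest_band(profile, frac=0.5):
    """Longest run with profile >= frac * max(profile), as (start, end).

    The end index is exclusive. Returns None if the profile has no edges.
    Assumes frac > 0.
    """
    if not profile or max(profile) == 0:
        return None
    cutoff = frac * max(profile)
    best, start = None, None
    # The trailing -1 is a sentinel that closes a run reaching the last index.
    for i, value in enumerate(list(profile) + [-1]):
        if value >= cutoff and start is None:
            start = i
        elif value < cutoff and start is not None:
            if best is None or i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best


def plate_box(edges, frac=0.5):
    """Guess (top, bottom, left, right) of the densest edge region."""
    rows = densest_band(projection(edges, 0), frac)
    if rows is None:
        return None
    band = edges[rows[0]:rows[1]]          # keep only the candidate rows
    cols = densest_band(projection(band, 1), frac)
    if cols is None:
        return None
    return rows[0], rows[1], cols[0], cols[1]


if __name__ == "__main__":
    edges = [
        [0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 1, 0],
        [0, 1, 1, 1, 1, 1, 0, 0],
        [0, 1, 1, 1, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0],
        [1, 0, 0, 0, 0, 0, 0, 0],
    ]
    print(plate_box(edges))
```

A few stray edge pixels are ignored because they stay under the cutoff, but a strong reflection or a busy background produces its own dense band, and that is exactly where this simple approach fails for me.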

