Re: California License Plate font issues with OCR

ZIA Fri, 30 Jul 2010 08:26:30 -0700

Hello,

Permuting may work, but I haven't tried it. I am also looking for a font
sample of the CA license plate, which would let me train my own OCR. I
really don't know where I can get sample images of A to Z and 0 to 9 in
the CA license plate font.
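
For reference, the position-based mapping Andres describes below could be
sketched roughly like this in Python. It assumes the California pattern of
one digit, three letters, three digits that Jimmy gives below. fix_plate
and the confusion tables are made-up names and guesses for illustration,
not anything taken from Tesseract.

```python
# Hypothetical helper: position-aware cleanup of a 7-character OCR result.
# Assumes the California pattern digit + 3 letters + 3 digits (e.g. 1ABC234).
PATTERN = "DLLLDDD"  # D = digit expected, L = letter expected
DIGITS = "0123456789"
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Confusion pairs are illustrative guesses, not measured from real plates.
TO_DIGIT = {"Z": "2", "O": "0", "G": "6", "I": "1", "S": "5", "B": "8"}
TO_LETTER = {"2": "Z", "0": "O", "6": "G", "1": "I", "5": "S", "8": "B"}


def fix_plate(text):
    """Return the corrected plate, or None if it cannot fit the pattern."""
    text = text.strip().upper().replace(" ", "")
    if len(text) != len(PATTERN):
        return None
    out = []
    for ch, kind in zip(text, PATTERN):
        if kind == "D":
            ch = TO_DIGIT.get(ch, ch)   # letter seen where a digit belongs
            if ch not in DIGITS:
                return None
        else:
            ch = TO_LETTER.get(ch, ch)  # digit seen where a letter belongs
            if ch not in LETTERS:
                return None
        out.append(ch)
    return "".join(out)


print(fix_plate("ZA2C1O4"))
```

With these tables, fix_plate("ZA2C1O4") gives "2AZC104". A confusion such
as "O" read as "U" cannot be repaired this way, because both characters are
letters and the position says nothing about which one is right.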


For LP extraction, I am trying to implement some kind of rectangular
window (a concept from SCW, described in one paper). I applied an edge
filter, which shows the license plate clearly; I just need to extract
it. A simple histogram approach works if there is not a lot of noise,
but even reflections in the images cause problems.
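
The simple histogram approach can be sketched as a row and column
projection over the binary edge image. This is only a rough sketch and not
the SCW method. locate_plate, peak_band and the 0.5 fraction are
assumptions for illustration, and the edge image is assumed to be a list of
rows holding 0/1 values.

```python
# Sketch: projection histograms over a binary edge map (rows of 0/1 values).
def peak_band(profile, fraction=0.5):
    """Return (start, end) of the run around the peak that stays at or
    above fraction * peak, or None if the profile is empty or all zero."""
    peak = max(profile, default=0)
    if peak == 0:
        return None
    threshold = peak * fraction
    start = end = profile.index(peak)
    while start > 0 and profile[start - 1] >= threshold:
        start -= 1
    while end < len(profile) - 1 and profile[end + 1] >= threshold:
        end += 1
    return start, end


def locate_plate(edges, fraction=0.5):
    """Return (top, bottom, left, right) of the densest edge region."""
    rows = peak_band([sum(row) for row in edges], fraction)
    if rows is None:
        return None
    top, bottom = rows
    # Column histogram restricted to the rows found above.
    col_profile = [
        sum(edges[y][x] for y in range(top, bottom + 1))
        for x in range(len(edges[0]))
    ]
    left, right = peak_band(col_profile, fraction)
    return top, bottom, left, right


edges = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],  # isolated noise pixel
    [0, 1, 1, 1, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print(locate_plate(edges))
```

On this toy image the result is (2, 4, 1, 5). A strong reflection would
add its own edge-dense rows and could pull the peak away from the plate,
which matches the failures described above.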

On Jul 29, 5:38 am, "Jimmy O'Regan" <[email protected]> wrote:
> On 29 July 2010 03:23, Andres <[email protected]> wrote:
>
>
>
> > Hello,
>
> > I'm working on the same as you, for the licence plates from Argentina, as I
> > live in Argentina.
>
> > Same as you described, the problem was to locate the licence plate.
>
> > Now I'm working with the OCR and then I will work on horizontalizing the
> > images, because if they are not completely horizontal, the OCR fails, for
> > example today I was getting a 5 instead of a 6. When I horizontalized the
> > image with Photoshop, everything turned out OK.
>
> > I don't know the layout of the positions of letters and numbers in
> > California plates, are they mixed? ...if you know whether the character
> > should be a number or a letter according to its position, you have two
> > options (as far as I know):
>
> > - when recognizing char by char, tell Tesseract that you expect a number or
> > a letter. I saw that somewhere inside the source code, I don't remember
> > where.
>
> You were probably looking at the code that guesses among 1, l and i
>
> Most of the code in the dict/ directory does some variation on this,
> by 'permuting' the character possibilities.
>
> > - make your own conversion, e.g., if you are expecting a number and you get
> > a G, map it to a 6; if you expect a letter and get a 2, map it to a Z.
>
> Patrick may have more details on this approach.
>
> According to Wikipedia
> (http://en.wikipedia.org/wiki/Vehicle_registration_plates_of_Argentina),
> the normal Argentinian license plates follow the template AAA 000, so
> you could just generate the possible combinations, and use them in a
> dawg.
>
>  perl -e 'for $a (65..90){for $b (65..90) {for $c (65..90) {printf
> "%c%c%c\n", $a, $b, $c;}}}'
>  perl -e 'for $a (0..9){for $b (0..9) {for $c (0..9) {printf
> "%d%d%d\n", $a, $b, $c;}}}'
>
> Will get you the two lists you want.
>
> For the original question, according to
> http://en.wikipedia.org/wiki/Vehicle_registration_plates_of_California
> this is the California scheme:
> perl -e 'for $a (0..9){for $b (65..90){for $c (65..90) {for $d
> (65..90) {for $e (0..9){for $f (0..9) {for $g (0..9) {printf
> "%d%c%c%c%d%d%d\n", $a, $b, $c, $d, $e, $f, $g;}}}}}}}'
>
> > I think that I'll use the last one, I'm not on that part yet. I'm getting
> > good results on images where the characters are big because of the distance
> > of the camera, but in small letters (13 pixels height) things are not good.
>
> > So I have a pair of ideas to test, perhaps somebody from the group could
> > give me opinions regarding them:
> > - following the contour, with polygon approximation of the chars, making an
> > image with those contours and running Tesseract on that image (trained for
> > that)
>
> Seems reasonable. Something like autotrace or potrace might be useful.
>
> > - make an image with my font (one of each from the alphabet), and repeating
> > the alphabet with different levels of threshold. I think that internally
> > Tesseract thresholds the images. Hard to explain this, but I think that it
> > may improve the quality.
>
> Yes, Tesseract internally thresholds the image. I think Google did
> something like this in the Tesseract 3 language packs, so it might be
> worth doing.
>
>
>
> > If you want to continue speaking about specifics of licence plate
> > recognition, we can continue privately because it's off topic. I'm
>
> Well, you've earned my applause for recognising that, but if your
> conversation turns up information that will save someone some time
> later on, I'm all for it.
>
>
>
> > interested in continuing. There are many things to speak about, for example,
> > the prices of the cameras, light filters, times of execution, etc.
>
> > You can write me to andrej100 at gmail
>
> > Regards,
>
> > Andres
>
> > 2010/7/28 ZIA <[email protected]>
>
> >> I am writing a license plate recognition application in C#. I am
> >> almost done. I had started work on my own OCR, but then I decided to
> >> use tesseract-ocr, which now partially works. I feed California
> >> license plates to the OCR, but it doesn't recognize some of the
> >> characters: for example, "Z" becomes the number 2, the letter "O"
> >> becomes "U", and the number 4 becomes something else. Any suggestions?
> >> Is there a language file or font file that will solve this issue?
> >> Besides that, in complex images I am having a hard time locating the
> >> license plate, but my concern is now the OCR, since I thought I would
> >> save time by using Tesseract rather than writing my own neural
> >> network. I would really appreciate any ideas or suggestions.
>
> >> --
> >> You received this message because you are subscribed to the Google Groups
> >> "tesseract-ocr" group.
> >> To post to this group, send email to [email protected].
> >> To unsubscribe from this group, send email to
> >> [email protected].
> >> For more options, visit this group at
> >> http://groups.google.com/group/tesseract-ocr?hl=en.
>
> --
> <Leftmost> jimregan, that's because deep inside you, you are evil.
> <Leftmost> Also not-so-deep inside you.
