On Mar 31, 6:48 am, Leith Bade <[email protected]> wrote:
> Hi,
>
> I am working on an image processing assignment and I would like to use
> Tesseract for recognising the letters/numbers on the plate after it
> has been located.
>
> I want to use Tesseract as OCR is hard and this library claims that it
> can handle skewed and curved lines.
>
> It seems to work reasonably well but I think I need to tweak the
> settings.
>
> So far I have told it to only look for A-Z and 0-9.
>
> It seems to try to break the number plate up into words, when it
> should be a single 'word'.
>
> It also tries to force words into all numbers or all letters rather
> than letting them mix, e.g. 4536B becomes 45368.
>
> So how do I get it to disable the word breaking, and disable the
> dictionary/number classifier part?
>
> Is it possible to tell it some sort of pattern to match? All the
> number plates I need to recognise follow these patterns:
> S[C or D][A to Z]xxxx[A to Z]
> S[C or D][A to Z]xxx[A to Z]
> S[C or D][A to Z]xx[A to Z]
> E[A to Z]xxxx[A to Z]
>
> Some of the S... number plates are split into two lines:
>
> S[C or D][A to Z]
> xxxx[A to Z]
>
> S[C or D][A to Z]
> xxx[A to Z]
>
> x = [0 to 9]
>
> Thanks,
> Leith

Hi Leith,

I believe the real challenge in applying OCR to plate recognition is that plate images are "too dirty" compared to paper documents. There are frames, skew, uneven shadows, etc. You have to do your own work to segment the plate into separate characters and feed those to the OCR engine. I don't think Tesseract itself can handle this automatically given the raw image, but I believe it will do pretty well once you have binarized, separated characters. Basically, plate recognition is more of an image processing problem than an OCR problem. You can use the grammar as a post-processing step to make corrections.

That's my 2 cents.

zl2k
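
To make the grammar post-processing idea concrete, here is a rough sketch in Python. It takes whatever string the OCR step returns, tries to fit it to each plate pattern from the original question, and swaps look-alike characters where the pattern expects the other class (digit vs. letter). The look-alike tables and the names `fit` and `correct_plate` are my own assumptions for illustration, not anything Tesseract provides, and I am reading the last pattern as E[A to Z]xxxx[A to Z]. You would want to tune the tables to the confusions you actually see.

```python
import string

LETTERS = set(string.ascii_uppercase)
DIGITS = set(string.digits)

# Assumed look-alike pairs; adjust to the confusions seen in practice.
TO_LETTER = {"8": "B", "0": "O", "1": "I", "5": "S", "2": "Z", "6": "G"}
TO_DIGIT = {"B": "8", "O": "0", "D": "0", "Q": "0",
            "I": "1", "S": "5", "Z": "2", "G": "6"}

# One template per plate pattern: a list of allowed character sets.
TEMPLATES = [
    [{"S"}, {"C", "D"}, LETTERS] + [DIGITS] * n + [LETTERS] for n in (4, 3, 2)
]
TEMPLATES.append([{"E"}, LETTERS] + [DIGITS] * 4 + [LETTERS])


def fit(text, template):
    """Return (fixed_text, substitutions) or None if text cannot fit."""
    if len(text) != len(template):
        return None
    out, cost = [], 0
    for ch, allowed in zip(text, template):
        if ch in allowed:
            out.append(ch)
            continue
        # Try the look-alike from the other character class.
        alt = TO_LETTER.get(ch) if ch in DIGITS else TO_DIGIT.get(ch)
        if alt is None or alt not in allowed:
            return None
        out.append(alt)
        cost += 1
    return "".join(out), cost


def correct_plate(raw):
    """Fit raw OCR text to the plate grammar; None if no pattern fits."""
    # Dropping whitespace also joins plates that were read as two lines.
    text = "".join(raw.upper().split())
    fits = [r for r in (fit(text, t) for t in TEMPLATES) if r is not None]
    if not fits:
        return None
    # Prefer the candidate that needed the fewest substitutions.
    return min(fits, key=lambda r: r[1])[0]


if __name__ == "__main__":
    print(correct_plate("SDA45368"))   # the 4536B / 45368 case from the question
    print(correct_plate("SCB\n123X"))  # a two-line plate
```

Anything that comes back as None did not fit any pattern even after substitutions, so it is probably a segmentation problem rather than a single misread character.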

