Hi everyone, I am using Tesseract-OCR 3.01 for license plate recognition. I trained the data with 200 license plate images without preprocessing (the plates had already been located in the pictures by my own algorithm). Every character has at least 15 samples. However, the results I get are not good enough.
First, Tesseract cannot tell '8' and 'B' apart with my trained data, but when I recognize '8' and 'B' with the data downloaded from the Tesseract website, I actually get better results. Is there any limitation on the data used for training? Otherwise, why do I get worse results with my own trained data? Some of my training images are clear, some are blurry, and some are skewed, but I fed the original images into training without any preprocessing.

Moreover, if I crop the license plate too loosely from the picture, the redundant border region gets recognized as an extra character or merged with a nearby digit into a wrong character. Also, images taken in dark places can rarely be recognized at all.

I think some preprocessing might help: removing the redundant parts and noise, or perhaps processing the image character by character would give better results. Is there any suggestion to improve this?

Regards,
Dena

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
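P.S. A minimal sketch of the kind of preprocessing I have in mind, in plain Python: Otsu's global threshold (which should help with plates photographed in the dark) followed by cropping to the bounding box of the character ink (to trim the redundant border). All function names here are illustrative, not Tesseract API calls, and it assumes dark characters on a lighter plate background:

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(256))
    sum_bg = 0.0
    w_bg = 0
    best_t, best_var = 0, -1.0
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance; larger means a cleaner split.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize_and_crop(image):
    """image: list of rows of gray levels (0-255).

    Returns a binary image (1 = character ink, 0 = background)
    cropped to the bounding box of the ink.
    """
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    # Assumption: dark characters on a lighter plate background.
    binary = [[1 if p <= t else 0 for p in row] for row in image]
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for row in binary for c, v in enumerate(row) if v]
    if not rows:
        return binary  # no ink found; return uncropped
    r0, r1 = min(rows), max(rows)
    c0, c1 = min(cols), max(cols)
    return [row[c0:c1 + 1] for row in binary[r0:r1 + 1]]
```

The cropped binary image could then be saved and passed to Tesseract instead of the raw crop; a similar pass per segmented character would cover the character-by-character idea.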

