Hello

I am using Tesseract OCR to perform number plate recognition on 
vehicles.

I am using jTessBoxEditor v1.1 to check my box and tif files.

At the moment each iteration of my training uses about 250 - 300 number 
plates.

I have read in many places that one should train fonts separately. This is 
difficult in my case, as my source images contain number plates in varying 
fonts. The only way to separate them would be to manually look through each 
one of the 100 initial images I use per training iteration and sort them 
into different groups. Would this really be necessary?

I have been training for over a month now, probably on over 1000 images 
and 3000 number plates, and I cannot seem to get accuracy above 86%.

I was wondering if you have any suggestions, as ideally I would like to see 
accuracy in excess of 90%.

What I have picked up is that the OCR struggles with certain problem 
characters: O vs 0, 5 vs S, 2 vs Z, B vs 8.
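
To put numbers on this, here is a minimal sketch of how the misreads could 
be tallied. It assumes a list of (expected, OCR result) plate strings is 
available; the function names and the sample plates are made up for 
illustration and are not part of Tesseract or jTessBoxEditor. Plates whose 
OCR result has a different length are counted as wrong but skipped in the 
per-character tally, since there is no simple way to align them.

```python
from collections import Counter

# The confusable pairs named above, stored as unordered sets.
CONFUSABLE = {frozenset(p) for p in [("O", "0"), ("5", "S"), ("2", "Z"), ("B", "8")]}


def tally_confusions(pairs):
    """Return (plate-level accuracy, Counter of (expected, read) character misreads).

    `pairs` is an iterable of (expected, ocr_result) plate strings.
    """
    plates_total = 0
    plates_correct = 0
    confusions = Counter()
    for expected, actual in pairs:
        plates_total += 1
        if expected == actual:
            plates_correct += 1
            continue
        if len(expected) != len(actual):
            # Wrong plate, but characters cannot be aligned position by position.
            continue
        for e, a in zip(expected, actual):
            if e != a:
                confusions[(e, a)] += 1
    accuracy = plates_correct / plates_total if plates_total else 0.0
    return accuracy, confusions


def confusable_share(confusions):
    """Fraction of character misreads that fall into one of the known pairs."""
    total = sum(confusions.values())
    hits = sum(n for (e, a), n in confusions.items()
               if frozenset((e, a)) in CONFUSABLE)
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical sample data, not real results.
    sample = [("CA5028", "CAS028"), ("BZ2010", "BZ2010"),
              ("OB8123", "0B8123"), ("XY1234", "XY1284")]
    accuracy, confusions = tally_confusions(sample)
    print(accuracy, confusions.most_common(), confusable_share(confusions))
```

If most of the misreads turn out to be in these four pairs, that would 
suggest the remaining errors are down to letter/digit ambiguity rather than 
to the training data in general.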

Is there a specific way of training that I should use to improve correct 
reads of these characters? While editing the tif/box files in jTessBoxEditor 
I am torn between two approaches: discarding the poorly read characters and 
keeping only the well-read ones, or correcting each and every character to 
what it should be regardless of its quality in the tif file. Which is the 
better approach, and why?

Any other suggestions on how to improve my training using jTessBoxEditor 
would be greatly appreciated.

Thanks
