Mark,
Did you find a solution to the problem quoted below (extracted from your
original message)? If so, please let me know. Thanks.
*What I have picked up is that the OCR struggles with certain problem
characters : O vs 0, 5 vs S, 2 vs Z, B vs 8*
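
For what it's worth, one workaround that is independent of training is to fix these look-alike pairs after recognition, using the known layout of the plate. Below is a minimal sketch, assuming every plate follows a fixed letter/digit layout. The layout string "LLLDDDLL" and the function name fix_confusables are hypothetical and only for illustration. The input is whatever text Tesseract returned for the plate.

```python
# Look-alike pairs from the message above: O/0, S/5, Z/2, B/8.
TO_DIGIT = {"O": "0", "S": "5", "Z": "2", "B": "8"}
TO_LETTER = {digit: letter for letter, digit in TO_DIGIT.items()}


def fix_confusables(text, layout):
    """Swap look-alike characters so each position matches the layout.

    layout uses 'L' where a letter is expected and 'D' where a digit is
    expected (an assumed convention, not anything Tesseract defines).
    """
    text = text.upper().replace(" ", "")
    if len(text) != len(layout):
        # The layout does not apply to this read; leave it untouched.
        return text
    fixed = []
    for ch, kind in zip(text, layout):
        if kind == "D":
            fixed.append(TO_DIGIT.get(ch, ch))
        elif kind == "L":
            fixed.append(TO_LETTER.get(ch, ch))
        else:
            fixed.append(ch)
    return "".join(fixed)


# Hypothetical raw OCR read checked against a hypothetical layout.
print(fix_confusables("A8C1Z3DE", "LLLDDDLL"))
```

This only helps where the position decides whether a letter or a digit is valid. It does not improve the trained data itself, and reads whose length does not match the layout are passed through unchanged.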
On Thursday, November 20, 2014 7:53:43 AM UTC-5, Mark Beylis wrote:
>
> Hello
>
> I am making use of Tesseract OCR to perform number plate recognition on
> vehicles
>
> I am making use of jTessBoxEditor v1.1 to check my box and tif files
>
> At the moment each iteration of my training uses about 250-300 number
> plates
>
> I have read in many places that one should train fonts separately. This is
> difficult in my case, as my source images consist of number plates with
> varying fonts, unless I manually look through each one of the 100 initial
> images I use per training iteration to separate them into different
> groups. Would this really be necessary?
>
> I have been training for over a month now, probably on over 1000 images
> and 3000 number plates, and cannot seem to get accuracy above 86%
>
> I was wondering if you have any suggestions, as ideally I would like to
> see accuracy in excess of 90%
>
> What I have picked up is that the OCR struggles with certain problem
> characters : O vs 0, 5 vs S, 2 vs Z, B vs 8
>
> Is there a specific way of training that I should use to improve correct
> reads of these characters? During my editing of the tif/box files in
> jTessBoxEditor I am torn between two approaches: discarding the poorly read
> characters and keeping only the good ones, or correcting every character
> to what it should be regardless of its quality in the tif file. Which is
> the better approach, and why?
>
> Any other suggestions on how to improve my training using jTessBoxEditor
> would be greatly appreciated
>
> Thanks
>