I'd try upscaling the images so that each letter is about 40 to 50 pixels
tall and see if that helps.
I'd also try a morphological open/erode operation (or a blur/resharpen) to
simply fill the holes.

I do not know whether there are any special parameters for this kind of
problem (which I've encountered too).

In general, adding noise to the training data makes the model more robust.
You can use custom code or something like imgaug
<https://github.com/aleju/imgaug> to generate random variations with random
white spots and other corruptions.
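
For the "custom code" route, a minimal sketch using only NumPy could look
like the following. It assumes binary training glyphs with ink = 0 and
background = 255; the function names and the limits max_holes and max_size
are hypothetical, and this is not how imgaug does it.

```python
import numpy as np


def add_random_holes(glyph, rng, max_holes=3, max_size=2):
    """Return a copy of the glyph with a few small white spots punched
    into randomly chosen ink pixels."""
    noisy = glyph.copy()
    ink_rows, ink_cols = np.nonzero(glyph == 0)
    if ink_rows.size == 0:
        return noisy  # nothing to corrupt
    n_holes = rng.integers(1, max_holes + 1)
    for _ in range(n_holes):
        i = rng.integers(ink_rows.size)       # pick one ink pixel
        size = rng.integers(1, max_size + 1)  # spot is size x size pixels
        r, c = ink_rows[i], ink_cols[i]
        noisy[r:r + size, c:c + size] = 255
    return noisy


def augment(glyph, copies=5, seed=0):
    """Generate several randomly corrupted variants of one glyph."""
    rng = np.random.default_rng(seed)
    return [add_random_holes(glyph, rng) for _ in range(copies)]
```

Each variant keeps the label of the clean glyph, so the training set sees
the same character with holes in different places.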


Bye

Lorenzo

2018-06-22 5:04 GMT+02:00 blues <[email protected]>:

> Hi all,
> I'm using tesseract for number plate recognition (openalpr); it passes
> single characters to tesseract for recognition.
>
> I found that recognition accuracy is very sensitive to holes in a
> character. If the character in the binary image has one or more small
> holes in it, then it's likely to get a wrong result.
>
> For example, this "0" is falsely recognized as "Z":
>
>
> <https://lh3.googleusercontent.com/-DNd643TD6oQ/WyxkoE_K7FI/AAAAAAAAADs/1wTR_01Uta43AjSs7QAFXsHLFqUzPJWhACLcBGAs/s1600/0-1-as-z.png>
>
> But with just a single pixel different, which opens the hole in its upper
> part, it is correctly recognized as "0":
>
>
> <https://lh3.googleusercontent.com/-QFh_EZaTVqQ/Wyxk0As4GbI/AAAAAAAAADw/9KrQ0CufD-g4G5zSKgrfjF5U_64iVJDygCLcBGAs/s1600/0-1.png>
> some more examples are attached.
>
> I cannot predict where the holes are going to be, because they are caused
> by noise in the image, so I think they should not be added to the training
> samples. Is there a way to fix this, to make recognition robust to small
> noise?
>
> thank you
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit https://groups.google.com/d/
> msgid/tesseract-ocr/3f59708b-d55a-499b-9ce6-035f492dfe89%
> 40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/3f59708b-d55a-499b-9ce6-035f492dfe89%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
