It would help if you could provide some example input images.

Zdenko


On Fri, Sep 25, 2020 at 7:57, Radu Stoicescu <[email protected]> wrote:

> I have some scanned, machine-typed documents that have a lot of noise. I can reduce
> the noise, and I have done so. But there is some noise that is
> statistically indistinguishable from letters: as dark as the letters and as
> big as the letters, therefore I cannot just take it out.
>
> I have tried to only train Tesseract on Courier New, and although the
> accuracy went down, which was expected because I did not use enough data,
> there were still letters detected in the noisy areas.
>
> How can I keep Tesseract from detecting letters in noise? One simple rule
> would be to only detect characters of one size, since this is machine typed
> text.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/24c0b6ae-e07b-443b-ba60-38470b852275n%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/24c0b6ae-e07b-443b-ba60-38470b852275n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
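
The "one size only" rule proposed in the quoted message could be sketched as a
post-processing step on character boxes. This is only an illustration, not a
Tesseract feature: the function name filter_by_height and its low/high ratios
are hypothetical, and the boxes are assumed to be (char, left, bottom, right,
top) tuples, the same field order used in Tesseract box files. Note that it
cannot reject noise that is the same size as the letters, which the original
message says is also present.

```python
from statistics import median


def filter_by_height(boxes, low=0.1, high=1.6):
    """Keep boxes whose height is plausible for a single typeface size.

    boxes: list of (char, left, bottom, right, top) tuples (assumed layout).
    low/high: hypothetical bounds, as multiples of the median box height.
    The lower bound is loose so that punctuation such as '.' survives.
    """
    if not boxes:
        return []
    heights = [top - bottom for _, _, bottom, _, top in boxes]
    typical = median(heights)
    return [
        box
        for box, height in zip(boxes, heights)
        if low * typical <= height <= high * typical
    ]


# Made-up boxes: four plausible glyphs, one large blob, one tiny speck.
sample = [
    ("T", 10, 10, 20, 30),
    ("e", 22, 10, 30, 24),
    ("x", 32, 10, 40, 24),
    (".", 42, 10, 44, 12),
    ("#", 50, 5, 120, 95),
    ("'", 60, 40, 61, 41),
]
kept = filter_by_height(sample)
```

With these made-up values the median height is 14, so the 90-pixel blob and
the 1-pixel speck are dropped while the period is kept. The ratios would need
tuning on real scans.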
