You're spot on: Tesseract is not going to invent missing pixels for you,
and neither is dilation/erosion preprocessing. You have to start with some
pixels to stand any chance of success and, as you note, your receipt has
many areas of rubbed-out characters.
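
To make the point about dilation/erosion concrete, here is a minimal sketch
in plain Python. It is a toy, not what Tesseract or any imaging library does
internally: it treats one scanline of a stroke as a list of 0/1 pixels and
applies a 3-pixel-wide morphological closing (dilate, then erode). The
function names and the sample row are made up for illustration.

```python
# Toy scanline: 1 = ink, 0 = background. The row and helper names are
# hypothetical; the window is 3 pixels wide (a pixel and its two neighbours).

def dilate(row):
    """A pixel becomes ink if it or an adjacent in-bounds pixel is ink."""
    n = len(row)
    return [max(row[max(i - 1, 0):min(i + 2, n)]) for i in range(n)]

def erode(row):
    """A pixel stays ink only if it and its in-bounds neighbours are ink."""
    n = len(row)
    return [min(row[max(i - 1, 0):min(i + 2, n)]) for i in range(n)]

def close(row):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(row))

# A stroke with a 1-pixel nick (index 3) and a 5-pixel rubbed-out
# stretch (indices 7 to 11).
stroke = [1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1]
closed = close(stroke)

print(stroke)
print(closed)
```

With this window the 1-pixel nick gets bridged, but the 5-pixel stretch comes
back exactly as empty as it went in. A wider window would bridge it, but on a
real receipt it would also merge neighbouring strokes and characters, so it
does not recover the lost shape.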

Perhaps in some future world there will be a way of using ML techniques to
train a system on rubbed-out receipts so that it can guess which words the
damaged characters most likely belong to.

On 27 September 2016 at 14:24, CodeBreaker <[email protected]> wrote:

> For receipts, I have so far tried all the psm options; the best is psm 6,
> combined with resizing the image resolution. For the rubbed-out
> characters, though, there is hardly anything Tesseract can do. Correct me
> if I'm wrong. :-)
>
> On Thursday, 21 May 2015 19:48:41 UTC+8, Claudi Ruiz wrote:
>>
>>
>> <https://lh3.googleusercontent.com/-9BfUcgsF-pE/VV3EsRVTEoI/AAAAAAAAa3A/JUyuvbc2duM/s1600/result.jpg>
>> *Goal: *Improve the Tesseract output as much as possible.
>> *Difficulties: *different character sizes and poor image content quality.
>> *Already done:* binarize, dilate and erode.
>>
>> *DO YOU HAVE ANY IDEA HOW TO IMPROVE MY OUTPUT??!!*
>>
>> *P.S. *I have already checked:
>> https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality
>>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/6f158137-7ae2-4c27-a96a-e636b6b1da63%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
