Re: Different output for almost identical images

Zdenko Podobný Fri, 06 Apr 2012 11:35:03 -0700

On 06.04.2012 17:35, Rufus wrote:
> Thanks for the reply.
>
> I've tried another image (bad2.tiff), which is still a bit different from 
> good.tiff and has a compression ratio of the same order. 
> However, tesseract still doesn't output anything for bad2.tiff.
> I then tried feeding tesseract only the first character: it 
> works for bad_char.tiff (from bad.tiff) but it doesn't work for 
> bad2_char.tiff (from bad2.tiff).
>
> Commands:
> tesseract bad_char.tiff bad_char -l eng -psm 10 nobatch digits
> tesseract bad2_char.tiff bad2_char -l eng -psm 10 nobatch digits
>
>
> All the attached images are already thresholded, so I guess there is not 
> much room for improvement there. I've also tried training tesseract with a 
> new language consisting only of digits in a particular font (font: Impact 
> .... looks like the font in the images). Do you also experience these 
> problems when using tesseract?
>
I think the problem is the size of the text, the resolution and the
missing border around it. I tried this:
convert -border 500 -resample 300 -density 300 -resize 50 bad2.tiff bad2.png
and then
tesseract bad2.png bad2
produced results.
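
For anyone who wants to script the same two steps, here is a minimal Python sketch. It only wraps the two commands quoted above, with the same options in the same order; the function names (convert_cmd, tesseract_cmd, run_pipeline) are made up for illustration and are not part of tesseract or ImageMagick. It assumes both the convert and tesseract binaries are on the PATH.

```python
import subprocess


def convert_cmd(src, dst, border=500, dpi=300, size=50):
    """Build the ImageMagick command from the message (same option order).

    The defaults are the values used in the message; whether they suit
    other images is an assumption, so treat them as a starting point.
    """
    return [
        "convert",
        "-border", str(border),   # add a border around the text
        "-resample", str(dpi),    # resample to the target resolution
        "-density", str(dpi),     # record the resolution in the output
        "-resize", str(size),     # resize value as given in the message
        src, dst,
    ]


def tesseract_cmd(image, output_base):
    """Build the plain tesseract call; it writes <output_base>.txt."""
    return ["tesseract", image, output_base]


def run_pipeline(src, stem):
    """Hypothetical helper: preprocess src, run OCR, return the text."""
    png = stem + ".png"
    subprocess.run(convert_cmd(src, png), check=True)
    subprocess.run(tesseract_cmd(png, stem), check=True)
    with open(stem + ".txt", encoding="utf-8") as fh:
        return fh.read()
```

Calling run_pipeline("bad2.tiff", "bad2") should then be equivalent to running the two commands by hand and reading bad2.txt.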


