Hmmm, fixed image size, fixed region, constant colors, monospace raster
font...

Do you really want to engage a whole algorithmic monster to handle a
problem like this? With OCR, poor performance, training, preprocessing,
and coping with all sorts of recognition problems are guaranteed.

Pixel-to-pixel matching is the way to go!
100% accuracy.

Even if you are not willing to resort to full-fledged programming, just
crop out 10 digit samples and match them against your input image using a
shell script loop. Give your ImageMagick-fu a chance. Or you can even use
file compare! ))
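
If you do end up writing a few lines of code, the matching itself could look
roughly like the sketch below. It is only an illustration under my own
assumptions: the image is already loaded as a list of pixel rows, the digits
sit in fixed-width cells at a known position, and the names crop and
read_digits plus the toy 3x3 glyphs are made up for the example. Real templates
would be the 10 digit samples cropped from an actual screenshot.

```python
# Minimal sketch of exact pixel matching; all names here are hypothetical.
# An image is a list of rows, each row a list of pixel values
# (ints here, but RGB tuples compare the same way).


def crop(image, left, top, width, height):
    """Return the width x height sub-grid whose top-left corner is (left, top)."""
    return [row[left:left + width] for row in image[top:top + height]]


def read_digits(image, templates, left, top, cell_w, cell_h, count):
    """Read `count` fixed-width glyph cells starting at (left, top).

    `templates` maps a character to its cell_w x cell_h pixel grid.
    A cell that equals no template exactly is reported as '?'.
    """
    out = []
    for i in range(count):
        cell = crop(image, left + i * cell_w, top, cell_w, cell_h)
        match = next((ch for ch, glyph in templates.items() if glyph == cell), "?")
        out.append(match)
    return "".join(out)


# Toy 3x3 glyphs standing in for real digit samples.
templates = {
    "0": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
    "1": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
}

# A 3-row "screen" holding the glyphs 1, 0, 1 side by side, starting at x = 2.
screen = [
    [0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0],
]

print(read_digits(screen, templates, left=2, top=0, cell_w=3, cell_h=3, count=3))
```

Exact equality only works if the screenshot is lossless and the glyphs are
rendered identically every time, which is the situation described above. A
"?" in the result means a cell did not match any sample, so the offsets or
the templates need another look.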

HTH

Best regards,
Dmitri Silaev
www.CustomOCR.com





On Thu, Apr 23, 2015 at 9:05 AM, Leah Siddall <
[email protected]> wrote:

> Hi all!
>
> I am not having luck with tesseract and the fonts used in NES games like
> Super Mario Bros. 3 (I've attached an example screenshot).
> My goal is to scrape a screenshot for the "score" and "time remaining". The
> idea is to feed that into a database during a competition to minimize
> cheating.
>
> I've tried cropping, resizing, grayscale, and negating with PNG, TIF, JPG,
> and PNM formats then going through every PSM mode on each with poor
> results.
> The original screenshot is PNG 4800 × 3600 pixels at 144 pixels/inch
> straight from the emulator which is like the best possible situation.
>
> Just trying to get a baseline, I tried against the "Punch Out" screenshot
> (attached), where the fonts are clearly spaced with lots of empty space. It
> would get "CDHTIHUE" and "Nintendo", but totally missed the word "new"
> between the boxing gloves and jumbled the year numbers.
>
> To rule out user error, I did run against other images with more standard
> fonts and had no problems.
>
> I'm quite comfortable with imagemagick but very new to tesseract.
> I am using tesseract version from "brew install tesseract -HEAD" on
> OSX 10.10.2
> tesseract 3.04.00
>  leptonica-1.71
>   libjpeg 8d : libpng 1.6.16 : libtiff 4.0.3 : zlib 1.2.5
>
> This would be really really cool to pull off if possible. Any suggestions
> are greatly appreciated.
> thanks!! -leah
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/2088977c-529b-45bd-8059-b6906fb666ce%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/2088977c-529b-45bd-8059-b6906fb666ce%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
