[tesseract-ocr] Reading game screenshots, completely lost.

davidms1031 Mon, 12 Sep 2016 23:41:36 -0700


<https://lh3.googleusercontent.com/-Ih4uQKB9hpc/V9eUe7ddQsI/AAAAAAAAAAM/skGFAMee73smMx4Q9U2NWZJ62oF0GrFAwCLcB/s1600/MenuAllText.png>
I just started using Tesseract today through PyTesseract. I'm trying to 
have it read text from a game. It seems easy enough, but I'm completely 
stuck. The picture above is one I was trying to read. I thought it was a 
good test since there is just a black background and nothing to really 
mistake for a character, but I can't get it to read properly at all.

If I scan in English, it doesn't recognize the Japanese. If I scan in 
Japanese, it inserts kanji in place of English letters. If I use both, it 
still misses a bunch of the writing. I've tried cropping down to just the 
dialog box, but then it only reads the last 2 lines. If I shrink the 
image, it only reads the first half of the top two lines. Changing to 
grayscale did nothing, and forcing it to full contrast (every pixel either 
black or white) made it worse.

Also, when scanning just the dialog box, it got basically every kanji 
wrong, skipped other characters entirely, and read ら as a 6. Is this a 
font issue? Or does anyone know of any tricks to help with this? I'd 
appreciate any suggestions, since I can't seem to get anywhere with it.
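
For anyone who wants to reproduce this, here is a minimal sketch of the kind of pipeline described above. It assumes Pillow and pytesseract are installed and that the jpn and eng language data is available. The function names, the default scale factor of 3, and the inversion step (to get dark text on a light background) are illustrative assumptions, not settings known to fix this particular image.

```python
def lang_arg(languages):
    """Join language codes into the 'jpn+eng' form passed as lang=."""
    return "+".join(languages)


def scaled_size(size, scale):
    """Return (width, height) multiplied by an integer scale factor."""
    width, height = size
    return (width * scale, height * scale)


def prepare(image, box=None, scale=3):
    """Crop, grayscale, invert, and enlarge a PIL image before OCR.

    box is an optional (left, upper, right, lower) crop rectangle,
    for example the dialog box region.
    """
    # Imported here so the helpers above work without Pillow installed.
    from PIL import Image, ImageOps

    if box is not None:
        image = image.crop(box)
    gray = image.convert("L")       # 8-bit grayscale
    dark_on_light = ImageOps.invert(gray)  # assumes light text on black
    # Enlarge rather than shrink, so small glyphs get more pixels.
    return dark_on_light.resize(scaled_size(dark_on_light.size, scale),
                                Image.LANCZOS)


def read_text(path, languages=("jpn", "eng"), box=None, scale=3):
    """Run Tesseract on a screenshot file and return the recognized text."""
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        prepared = prepare(img, box=box, scale=scale)
    return pytesseract.image_to_string(prepared, lang=lang_arg(languages))
```

Calling read_text("MenuAllText.png") would run the whole image with both languages. Passing box=(left, upper, right, lower) restricts it to one region, and changing languages to ("jpn",) or ("eng",) gives the single-language runs for comparison.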


