Well, again, I'm not sure if I'm supposed to reply to my own topic, but as 
I thought, using my original method on that selection screen leads to 
complete crap: too much of the image is left, and tesseract literally gives 
up and returns an empty string. So I made another filter to pass the image 
through. With 2 filters on the image it can now read relatively reliably, 
but specks are still left over that it tries to read. The first picture is 
the original image and the second is that image after the 2 filters.

<https://lh3.googleusercontent.com/-4SvQpWIhIRI/V9t3W2eLyxI/AAAAAAAAAAs/SIJNXzAj1G4i-jtTbM57dbm7WDkXclhQQCLcB/s1600/Test.png>
<https://lh3.googleusercontent.com/-CWAGJKRr04E/V9t3dpGjK2I/AAAAAAAAAAw/NWjg97Rpn-khjWUbSme9oQWprhw4wn-gACLcB/s1600/2filter.png>
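
One generic way to get rid of leftover specks, which is not what the 2 
filters above do, is to drop every connected blob of foreground pixels below 
some size. The sketch below assumes the image has already been binarized 
into a list of rows with 1 for text pixels and 0 for background. The name 
remove_specks and the min_size parameter are made up for illustration. The 
threshold has to stay smaller than the smallest real stroke: dakuten marks 
are tiny blobs too, so an aggressive value could turn べ into へ.

```python
from collections import deque


def remove_specks(img, min_size):
    """Return a copy of a binary image with small foreground blobs erased.

    img is a list of rows; 1 means foreground (text), 0 means background.
    Blobs are 8-connected groups of foreground pixels. Any blob with fewer
    than min_size pixels is treated as noise and set to 0.
    """
    height = len(img)
    width = len(img[0]) if height else 0
    out = [row[:] for row in img]          # leave the input untouched
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search to collect one connected blob.
            blob = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Erase the blob if it is too small to be part of a character.
            if len(blob) < min_size:
                for by, bx in blob:
                    out[by][bx] = 0
    return out
```

Because the filter only looks at blob size and not at colors or positions, 
it should carry over to other games better than a filter tuned to one 
screen, though the right min_size still depends on the font size.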


The results of the tesseract scan were:

」ブ「.'~ー ' .'~ー ' .'~ー ' .'~ー ' .'~ー ' .'~ー '

鱒` 私、 お藁り見に来たんだ。

` ねえ、 あなたこの町の人でしょ ?
一人じゃ面臼くないもん。


The first line is obviously tesseract trying to read the specks left over 
at the top. Aside from an extra kanji and a stray apostrophe, the reading is 
nearly right (though 藁 and 臼 look like misreads of 祭 and 白). I tried it 
on another screenshot with different sprites and got:

)“シナ「とつせゅつへ` こうふんして
` ねつけなかったんで しょ?
ま 、 建国千年のお祭りだから
無理ないけど ・・・・・・


Which is again pretty close: there's the extra )“ at the start, the つ in 
the first line is actually a う, and the へ is actually a べ, which are easy 
to mix up. The problem is that the filters are really specific to this game 
at the moment, and I was hoping to keep them more generalized. There's also 
no way to really tell whether a kanji in the reading has been placed there 
in error. I want to make a program that periodically tries to read text 
from the game as I'm playing and performs some functions on it. Any ideas? 
I may just end up looking into another route. This one seemed the simplest, 
but the errors could mess up the functionality I'm trying to achieve. 
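
Two post-processing checks might reduce the damage from these errors, 
sketched here under assumptions rather than as a tested fix. The first drops 
lines that are mostly punctuation, like the junk line read from the specks. 
The second flags low-confidence words, assuming a Tesseract version recent 
enough to emit per-word confidences (for example the dict returned by 
pytesseract's image_to_data with Output.DICT, with "text" and "conf" lists). 
The function names and both thresholds are hypothetical and would need 
tuning.

```python
def is_japanese(ch):
    """True for the hiragana/katakana blocks and common CJK ideographs.

    Note that the katakana block also contains the marks ー and ・.
    """
    code = ord(ch)
    return 0x3040 <= code <= 0x30FF or 0x4E00 <= code <= 0x9FFF


def japanese_ratio(line):
    """Fraction of non-space characters in line that are Japanese."""
    chars = [c for c in line if not c.isspace()]
    if not chars:
        return 0.0
    return sum(is_japanese(c) for c in chars) / len(chars)


def drop_junk_lines(text, min_ratio=0.5):
    """Keep only non-empty lines that are at least min_ratio Japanese.

    min_ratio=0.5 is an assumed starting point, not a measured value.
    """
    return [line for line in text.splitlines()
            if line.strip() and japanese_ratio(line) >= min_ratio]


def flag_low_confidence(data, threshold=60):
    """Return (word, confidence) pairs scoring below threshold.

    data is assumed to be a dict with parallel "text" and "conf" lists.
    Rows with an empty word or a negative confidence are layout rows,
    not recognized text, so they are skipped.
    """
    flagged = []
    for word, conf in zip(data["text"], data["conf"]):
        conf = float(conf)  # some versions report confidences as strings
        if word.strip() and 0 <= conf < threshold:
            flagged.append((word, conf))
    return flagged
```

A program that polls the game could run drop_junk_lines on each reading and 
skip or retry any frame where flag_low_confidence returns something. This 
would not prove a given kanji is wrong, only point at the words tesseract 
itself was unsure about.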
