Thank you SO much for the replies guys!!

I read up on those binarization links, and that looks like it's going to be 
a bit out of my wheelhouse to implement. I see that there is a 
Python/OpenCV implementation of that paper, but I'm not sure if I could get 
that going, as I'm not familiar with either. I looked at the image file it's 
using right before it processes it (via the tessedit_write_image config), 
and the quality is good and everything is sharp, so I'm not sure how much 
binarization would help, other than removing the gibberish: 
http://i.imgur.com/ljBtNMQ.jpg

I tried going a character at a time, but for some reason I can't seem to 
get tesseract to work when I give it just one character; it doesn't see 
anything. If I give it a block of 4 tiles, then it works. I tried all the 
different pageseg_mode options as well, with no luck.

In one of the links, though, I saw something about the -psm setting. When I 
run the OCR with -psm 6, all of a sudden it works perfectly!!! I'm really not 
sure what that setting does. I've tried doing some searches, but I'm still 
unclear. Can you guys shed some light on that? I made a box file using that 
setting and put together a new traineddata file with it. Now if I try to 
run it using that language with anything other than -psm 6, it crashes 
tesseract. Is this something I need to be concerned about?
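
For anyone following along, this is roughly how the different -psm values 
could be compared side by side. It's only a sketch: the mode descriptions 
are taken from the Tesseract 3.x command-line help as far as I can tell 
(newer versions spell the flag --psm), and tiles.png, out, build_command and 
run are made-up names for illustration, not anything from my actual setup.

```python
import subprocess

# Subset of page segmentation modes, as described in the Tesseract 3.x
# command-line help text.
PSM_MODES = {
    3: "Fully automatic page segmentation, but no OSD (default)",
    6: "Assume a single uniform block of text",
    7: "Treat the image as a single text line",
    8: "Treat the image as a single word",
    10: "Treat the image as a single character",
}


def build_command(image, out_base, psm, lang="eng"):
    """Return the argv list for one tesseract run (3.x-style -psm flag)."""
    if psm not in PSM_MODES:
        raise ValueError("unknown psm: %r" % (psm,))
    return ["tesseract", image, out_base, "-l", lang, "-psm", str(psm)]


def run(image, out_base, psm, lang="eng"):
    """Run tesseract and return its exit code (non-zero means it failed)."""
    return subprocess.call(build_command(image, out_base, psm, lang))


if __name__ == "__main__":
    # Print the commands instead of running them, so this works even
    # without tesseract installed. "tiles.png" and "out" are placeholders.
    for psm in sorted(PSM_MODES):
        print(psm, PSM_MODES[psm])
        print("   ", " ".join(build_command("tiles.png", "out", psm)))
```

Swapping the print for a call to run() would show which modes actually 
crash with a given traineddata file, since a crash should come back as a 
non-zero exit code.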

My plan now is to make another, better set of box files and build a new 
language from them, using -psm 6 with the current traineddata. I'm hoping 
that it will be able to distinguish between the normal and wildcard tiles 
this way. *fingers crossed*

Thank you again for taking the time to reply; it was a very big help.

cheers,

Alex
