tesseract misreading exact training images

Jonah Wed, 14 Apr 2010 18:14:26 -0700

I created a font image for Tahoma in Photoshop to train Tesseract.
The training image had one of each letter of the alphabet, uppercase and
lowercase, plus the ten digits.  After the training was complete, I
created some sample images with full words, and for the most part it
works great.  But Tesseract still sometimes reads an "m" as
"rn".  Note that the letters in the samples it misreads are EXACTLY
the same as in the training data, down to the pixel, so I don't see why
it is making errors.  I have also added the line:


2 rn 1 m

to my DangAmbigs file, but this has no effect.  Can anyone help?
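
One thing that can be checked mechanically is whether the count fields in a DangAmbigs line agree with the number of character tokens that follow them. The sketch below is only that, a sketch: it assumes the layout I understand the Tesseract 2.x training notes to describe, where each line is a count, that many whitespace-separated characters, a second count, and that many replacement characters. The function name parse_ambig_line is made up for illustration and is not part of Tesseract.

```python
def parse_ambig_line(line):
    """Split one DangAmbigs-style line into (source_chars, target_chars).

    Assumed layout (not verified against the Tesseract sources):
        <n> <n characters, one token each> <m> <m characters, one token each>
    Returns None when a count is missing or does not match its tokens.
    """
    tokens = line.split()
    pos = 0
    groups = []
    for _ in range(2):
        # Each group must start with a positive integer count.
        if pos >= len(tokens) or not tokens[pos].isdigit():
            return None
        count = int(tokens[pos])
        if count < 1:
            return None
        chars = tokens[pos + 1 : pos + 1 + count]
        if len(chars) != count:
            return None
        groups.append(chars)
        pos += 1 + count
    # Anything left over means the counts and tokens disagree.
    if pos != len(tokens):
        return None
    return groups[0], groups[1]


if __name__ == "__main__":
    for candidate in ("2 rn 1 m", "2 r n 1 m"):
        print(repr(candidate), "->", parse_ambig_line(candidate))
```

Under that assumed layout, a line with "rn" written as a single token would not parse, while the same line with "r" and "n" as separate tokens would. Whether Tesseract itself is this strict is something I have not confirmed.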
