Okay, I figured this out. The small type was indeed causing the
problem, but there is a way around it. In Photoshop, I create the
type sample at the small size I want to train at (11px Tahoma in
this case). I then flatten the image and scale it to twice its
original size using nearest-neighbor resampling, and train
Tesseract on this enlarged sample. When I actually read the 11px
TIFs, I also rescale them to twice their original size before
feeding them into Tesseract. Tesseract has been 100% accurate so
far with this method.
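
For anyone who wants to do the enlargement outside Photoshop, here is a
rough sketch of what nearest-neighbor scaling by an integer factor does.
It is not the Photoshop implementation, only an illustration: the
function name upscale_nearest and the list-of-rows pixel layout are my
own assumptions, and a real pipeline would load and save the TIFs with
an imaging library.

```python
def upscale_nearest(pixels, factor=2):
    """Enlarge a 2D grid of pixel values by an integer factor.

    Nearest-neighbor scaling by a whole-number factor simply repeats
    each source pixel `factor` times horizontally and vertically, so
    no new (blended) pixel values are introduced.

    `pixels` is assumed to be a list of rows, each row a list of
    pixel values (a hypothetical layout chosen for this example).
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")

    scaled = []
    for row in pixels:
        # Repeat every pixel in the row `factor` times (horizontal stretch).
        wide_row = [value for value in row for _ in range(factor)]
        # Repeat the stretched row `factor` times (vertical stretch).
        for _ in range(factor):
            scaled.append(list(wide_row))
    return scaled


# A tiny 2x2 "image": 0 is black, 255 is white.
glyph = [
    [0, 255],
    [255, 0],
]

enlarged = upscale_nearest(glyph, 2)
for row in enlarged:
    print(row)
```

Each original pixel becomes a 2x2 block of the same value, so the
enlarged image contains only values that were already in the original.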


