[tesseract-ocr] How to effectively train tesseract

Jonas Fri, 11 Apr 2014 06:10:20 -0700

Hello

I am currently trying to improve Tesseract's recognition rate by messing
with the traineddata files. I have two questions:


1. I have to read many different fonts. Is it useful to train Tesseract on
as many fonts as possible, or is there a limit to what is useful?

2. I have read that the training text shouldn't just be "ABCDEFG..." and
that it should be more realistic. In my case, I have to read address lists.
Would Tesseract perform better if my training text is also an address list,
or is "The big brown fox..." enough?


Thanks for the help!

