Hi,

Does anybody know which fonts are used in the default English training
set? I've analyzed the text I'll be converting, so I know which fonts I
need to handle, but I don't know which ones the default set covers.

Also, on a related note, if anyone is interested, I've posted a Python
script that analyzes the fonts used in PDFs here:
http://michaeljaylissner.com/blog/and-the-winning-font-in-court-documents-is.
It's useful if you have a set of documents, some of which need OCR
and some of which don't.
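
To give a rough idea of the approach, here is a minimal sketch. It is not
the script from the blog post. It assumes that font names show up as
/BaseFont entries in the uncompressed parts of the PDF, so it will miss
fonts stored in compressed object streams. The function names
fonts_in_pdf_bytes and probably_needs_ocr are made up for illustration.

```python
import re
from collections import Counter

# Matches entries such as "/BaseFont /ABCDEF+Times-Roman" in raw PDF bytes.
# Assumption: the font dictionary is not inside a compressed object stream.
BASEFONT_RE = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()]+)")

# Subset fonts carry a tag of six uppercase letters followed by "+".
SUBSET_TAG_RE = re.compile(r"^[A-Z]{6}\+")


def fonts_in_pdf_bytes(data: bytes) -> Counter:
    """Count the /BaseFont names visible in the raw bytes of a PDF."""
    names = (m.group(1).decode("latin-1") for m in BASEFONT_RE.finditer(data))
    # Strip the subset tag so "ABCDEF+Times-Roman" counts as "Times-Roman".
    return Counter(SUBSET_TAG_RE.sub("", name) for name in names)


def probably_needs_ocr(data: bytes) -> bool:
    """Heuristic: a PDF with no visible font entries may be a scanned image."""
    return not fonts_in_pdf_bytes(data)
```

To try it on a file, read the file in binary mode (for example
open(path, "rb").read()) and pass the bytes to either function. A document
that reports no fonts is only a candidate for OCR, not a certain one.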

Thanks,

Mike
