Wow! This file works as well as the 20MB one (at least in my case). Anyway, it would be great to know the steps to generate one of those files.
On Monday, September 12, 2016 at 14:18:50 (UTC+2), Quan Nguyen wrote:
>
> You may consider using the old versions of eng.traineddata file, one of
> which is only 3MB.
>
> https://sourceforge.net/projects/tesseract-ocr-alt/files/
>
> On Sunday, September 11, 2016 at 7:02:54 AM UTC-5, Brais Gabín Moreira
> wrote:
>>
>> I'm using tesseract to recognize some screenshots. I'm building this in
>> an Android app so ~20MB of traineddata is a lot of weight. I know the font
>> in those screenshots.
>>
>> How can I reproduce the steps to generate the eng.traineddata? I want to
>> use the same data: text, dictionary, patterns, etc. Once I have that, I'll
>> strip out all the "useless" fonts and add the one I want.
>>
>

