Hi all, I would like to build *.traineddata files from scratch, specifically for English and Arabic.
Let's take English as an example. My question is how to prepare the fonts folder.

I read https://github.com/tesseract-ocr/tesseract/blob/master/src/training/language-specific.sh and found that it lists only about 32 font names. I used the "font manager" tool and found about 147 fonts installed on the training machine. I also opened https://github.com/tesseract-ocr/langdata_lstm/blob/master/eng/okfonts.txt, which contains 4567 font names.

1. Should I add the other Latin fonts installed on the training machine to "language-specific.sh"?
2. Should I search for, download, and install all the missing fonts on the training machine?
3. Should I collect all the font files from the training machine, create a new fonts folder "HOME/.fonts", and paste all the fonts into that folder?
4. I see that fonts have different extensions (*.ttf, *.otf, *.afm, ...). Do all font types work for training, or do I need a specific type?

I will write another question about the required text data.

Thanks for the help.

Regards,
Essam

