You can change the font list either in language-specific.sh or by passing it as a parameter (--fontlist) when you run tesstrain.sh.
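As a rough sketch of the second option, the helper below reads font names from a file and hands them to tesstrain.sh through --fontlist. It assumes the file holds one font name per line (check that npn.fontlist.txt really looks like that). The function name, the TESSTRAIN override variable, the langdata and tessdata paths, and the output directory are made up for illustration, so adjust them to your setup.

```shell
#!/bin/sh
# Sketch only: the directories and the TESSTRAIN override are assumptions,
# not something taken from this thread.

# Run tesstrain.sh once with every font named in the given file.
# Set TESSTRAIN=echo to print the command instead of running it.
train_with_fontlist() {
  fontlist_file=$1

  # Collect the font names as positional parameters so that names
  # containing spaces (e.g. "Exo Bold") stay single arguments.
  set --
  while IFS= read -r font || [ -n "$font" ]; do
    if [ -n "$font" ]; then
      set -- "$@" "$font"
    fi
  done < "$fontlist_file"

  "${TESSTRAIN:-training/tesstrain.sh}" \
    --lang npn \
    --langdata_dir ../langdata \
    --tessdata_dir ./tessdata \
    --fonts_dir /usr/share/fonts \
    --training_text npn_training_text.txt \
    --output_dir ./npn_train \
    --fontlist "$@"
}
```

Running `TESSTRAIN=echo train_with_fontlist npn.fontlist.txt` first prints the command that would be executed, which is a cheap way to check the font names before starting a long run.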
Read the wiki pages regarding training for more info.

On 31-Aug-2017 9:38 PM, "ShreeDevi Kumar" <[email protected]> wrote:

> Please see the tesstrain.sh script file in the training directory.
>
> It automates the whole training process.
>
> On 31-Aug-2017 9:29 PM, "Dan9er" <[email protected]> wrote:
>
>> Running
>> training/text2image --text=npn_training_text.txt --outputbase=npn.Exo.exp0
>> --font='Exo' --fonts_dir=/usr/share/fonts
>>
>> gives the desired output of two files:
>>
>> - npn.Exo.exp0.tif
>> - npn.Exo.exp0.box
>>
>> But running this command for the 162 fonts I want to use is very time
>> consuming and monotonous. I tried running this command:
>> training/text2image --text=npn_training_text.txt --outputbase=npn
>> --fonts_dir=/usr/share/fonts --find_fonts --min_coverage=1.0
>> --render_per_font=true
>>
>> But that only made files in this format: npn.{fontName}.tif
>>
>> *How do I automate making .tif AND .box files?* Do I have to change the
>> --outputbase to something different or do I have to make a .sh script?
>>
>> PS. I did run training/text2image --find_fonts with --render_per_font
>> set to false, so I have a npn.fontlist.txt file on hand.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/9d7df5ab-e1ad-43a6-9d7b-d7ba4ef39951%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.

