https://github.com/tesseract-ocr/langdata/blob/master/common.punc
You should read the Readme.md in the langdata repo for info on the files required for training.

On 10-Sep-2017 12:39 AM, "Dan9er" <[email protected]> wrote:

Ok, I made a shell script that runs tesstrain.sh with all 562 compatible fonts. But now I'm getting an error saying ./langdata/common.punc does not exist...

https://pastebin.com/8aaMjH6k

On Saturday, September 9, 2017 at 12:51:45 PM UTC-4, shree wrote:
> Your command needs to be along the following lines:
>
> training/tesstrain.sh \
>   --fonts_dir /home/shree/.fonts \
>   --tessdata_dir ./tessdata \
>   --training_text ../langdata/ben/ben.training_text \
>   --langdata_dir ../langdata \
>   --lang ben \
>   --linedata_only \
>   --noextract_font_properties \
>   --exposures "0" \
>   --fontlist "e-Grantamil" \
>   "e-Grantha OT" \
>   --output_dir ~/tesstutorial/ben
>
> See the fontlist argument: it takes the quoted names of the fonts. You can
> put one on each line, ending each line with \
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Sat, Sep 9, 2017 at 10:12 PM, Dan9er <[email protected]> wrote:
>
>> Nope. 😢
>> https://pastebin.com/BskUsSm7
>>
>> On Saturday, September 9, 2017 at 11:57:18 AM UTC-4, Dan9er wrote:
>>>
>>> I think I now know how to do it.
>>>
>>> I have to run training/text2image --find_fonts and then set the
>>> tesstrain --fontlist flag to the file that is generated.
>>>
>>> On Thursday, September 7, 2017, at 2:19:09 PM UTC-4, Dan9er wrote:
>>>>
>>>> I'm trying to train tesseract using tesstrain and I'm getting this
>>>> error: https://pastebin.com/xJj3w9jZ
>>>>
>>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>>
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/ee1d68eb-a92c-4a30-905f-ac52128bccb6%40googlegroups.com.
>>
>> For more options, visit https://groups.google.com/d/optout.
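
For anyone finding this thread later, here is a rough sketch of the two points above: checking that the langdata files exist before training, and passing font names to --fontlist as separate quoted arguments rather than as a file name. Both helper names (check_langdata, run_with_fontlist) and the fonts.txt file are hypothetical and not part of tesstrain.sh. The check covers only the two files mentioned in this thread; the langdata Readme lists the rest. Putting --fontlist last on the command line is also an assumption, so move it if your version of tesstrain.sh complains.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical, not part of tesstrain.sh.

# Report langdata files that are missing. Only the two files named in this
# thread are checked (an assumption); the langdata Readme has the full list.
check_langdata() (
  langdata_dir=$1
  lang=$2
  missing=0
  for f in "common.punc" "$lang/$lang.training_text"; do
    if [ ! -f "$langdata_dir/$f" ]; then
      echo "missing: $langdata_dir/$f" >&2
      missing=1
    fi
  done
  # Succeed only if nothing was missing.
  [ "$missing" -eq 0 ]
)

# Run a command with "--fontlist NAME NAME ..." appended, reading one font
# name per line from a file so names with spaces stay single arguments.
run_with_fontlist() (
  font_file=$1
  shift
  set -- "$@" --fontlist
  while IFS= read -r name || [ -n "$name" ]; do
    if [ -n "$name" ]; then
      set -- "$@" "$name"
    fi
  done < "$font_file"
  "$@"
)

# Example (paths taken from shree's command; fonts.txt is hypothetical):
#   check_langdata ../langdata ben &&
#   run_with_fontlist fonts.txt training/tesstrain.sh \
#     --fonts_dir /home/shree/.fonts --tessdata_dir ./tessdata \
#     --training_text ../langdata/ben/ben.training_text \
#     --langdata_dir ../langdata --lang ben --linedata_only \
#     --noextract_font_properties --exposures "0" \
#     --output_dir ~/tesstutorial/ben
```

Keeping the font names in a plain text file, one per line, avoids hand-quoting hundreds of names in the script itself.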

