That answer doesn't help me. How can I add dictionary files to tesstrain?
On Saturday, September 23, 2017 at 12:05:37 PM UTC-4, shree wrote:
>
> You cannot use a random unicharset, it needs to be the same one used for
> training the model.
>
> For multiple exposures, use the following method
>
> training/tesstrain.sh \
>   --fonts_dir /mnt/c/Windows/Fonts \
>   --lang eng \
>   --noextract_font_properties --linedata_only \
>   --exposures "-1, 0, 1" \
>   --langdata_dir ../langdata \
>   --tessdata_dir ../tessdata \
>   --fontlist \
>   "Arial" \
>   "Tahoma" \
>   "Times New Roman," \
>   "Sanskrit 2003," \
>   "FreeSerif Italic" \
>   "Times New Roman, Italic" \
>   --output_dir ../tesstutorial/eng
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Sat, Sep 23, 2017 at 8:46 PM, Dan9er <[email protected]> wrote:
>
>> I'm making a unicharset file so I can compile DAWG dictionary files so I
>> can use it with tesstrain.sh. I want to use multiple exposures (-1, 0,1)
>> for the tiff/box pairs. How should name them to separate the
>> different exposures?
>>
>> Can I do this?:
>>
>> lang.Arial.exp0
>> lang.Arial.exp1
>> lang.Arial.exp2
>>
>> Or will changing the file numbers screw things up? As an alternative, can
>> I do this?:
>>
>> lang.Arial0.exp0
>> lang.Arial1.exp0
>> lang.Arial2.exp0

