jpn.config in langdata/jpn loads jpn_vert as a sublanguage:

    tessedit_load_sublangs jpn_vert

You can try without that.

Also look at the settings for jpn in training/language_specific.sh. You may need to change the following as well:

    # The following fonts will be rendered vertically in phase I.
    VERTICAL_FONTS=( \
        "TakaoExGothic" \ # for jpn
        "TakaoExMincho" \ # for jpn
        "AR PL UKai Patched" \ # for chi_tra
        "AR PL UMing Patched Light" \ # for chi_tra
        "Baekmuk Batang Patched" \ # for kor
    )

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Mon, Apr 3, 2017 at 4:22 PM, <atuyosi.unloc...@gmail.com> wrote:

> Hi,
>
> I'm trying to create training data for Japanese (jpn.traineddata).
>
> I ran 'tesstrain.sh' with the '--linedata_only' option, and the script
> finished (return code 0), but the log file contains some error messages
> (repeated 22 times).
>
> ```
> $ ../tesseract-ocr/training/tesstrain.sh --fonts_dir /usr/share/fonts --lang jpn --linedata_only --noextract_font_properties --langdata_dir ../langdata --tessdata_dir /usr/local/share --output_dir ~/work/jpntrain
> ```
>
> ---
> [Sun Apr 2 07:42:30 UTC 2017] /usr/local/bin/tesseract /tmp/tmp.pwcwGMb5hs/jpn/jpn.IPAPMincho.exp0.tif /tmp/tmp.pwcwGMb5hs/jpn/jpn.IPAPMincho.exp0 lstm.train ../langdata/jpn/jpn.config
> [Sun Apr 2 07:42:30 UTC 2017] /usr/local/bin/tesseract /tmp/tmp.pwcwGMb5hs/jpn/jpn.IPAGothic.exp0.tif /tmp/tmp.pwcwGMb5hs/jpn/jpn.IPAGothic.exp0 lstm.train ../langdata/jpn/jpn.config
> Error opening data file /usr/local/share/tessdata/jpn_vert.traineddata
> Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
> Failed loading language 'jpn_vert'
> ---
>
> It seems that 'tesstrain.sh' requires 'jpn_vert.traineddata', but this
> file is not provided in the tessdata repository.
>
> How can I get this file? Or can I substitute 'jpn.traineddata' for
> 'jpn_vert.traineddata'?
>
> I've found that there is a 'jpn_vert' directory in the langdata
> repository, but it contains only some config files.
>
> Thanks.
>
> --
> You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/c776398d-0b2f-483d-a9ec-63476eaf0586%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
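
The suggestion at the top of this reply, trying a run without the tessedit_load_sublangs line, could be sketched roughly as below. This is only an illustration: strip_sublangs is a hypothetical helper name (not part of Tesseract or tesstrain.sh), and the demo runs on a throwaway file whose second setting is a made-up placeholder, not the real contents of jpn.config.

```shell
#!/bin/sh
# strip_sublangs SRC DST
# Copy SRC to DST, dropping any line that sets tessedit_load_sublangs.
# (Hypothetical helper for illustration; not a Tesseract tool.)
strip_sublangs() {
    sed '/^[[:space:]]*tessedit_load_sublangs[[:space:]]/d' "$1" > "$2"
}

# Demo on a temporary file so nothing under langdata is modified.
# 'example_setting 1' is a placeholder line, not a real jpn.config entry.
demo_dir=$(mktemp -d)
printf '%s\n' 'tessedit_load_sublangs jpn_vert' 'example_setting 1' \
    > "$demo_dir/jpn.config"

strip_sublangs "$demo_dir/jpn.config" "$demo_dir/jpn.nosub.config"

# Show what is left after the sublanguage line is removed.
cat "$demo_dir/jpn.nosub.config"
```

Against a real checkout the call would look something like `strip_sublangs ../langdata/jpn/jpn.config /tmp/jpn.nosub.config`, assuming the langdata path from the command quoted above. Since the log shows tesseract being handed ../langdata/jpn/jpn.config directly, it is probably simplest to back up that file and replace it with the stripped copy before rerunning tesstrain.sh.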