Do you use the scripts from the master repository? There were some updates after the 4.0 release...
Zdenko

On Wed, 5 Dec 2018 at 8:19, SEUNGGWANSHIN <tmdrhsl...@gmail.com> wrote:

> Hello guys,
>
> I'm training tesseract-lstm by following
> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
> and I have a problem using "tesstrain.sh".
>
> To create the training data, that page runs tesstrain.sh this way:
>
>     src/training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
>       --noextract_font_properties --langdata_dir ../langdata \
>       --tessdata_dir ./tessdata --output_dir ~/tesstutorial/engtrain
>
> So my command is below:
>
>     tesstrain.sh --fonts_dir /usr/share/fonts \
>       --lang kor \
>       --linedata_only \
>       --noextract_font_properties \
>       --langdata_dir ../langdata-master \
>       --tessdata_dir tessdata/tessdata_fast/ \
>       --output_dir kortrain
>
> My language is "kor", not "eng".
> When I executed this script, I got an unknown error like this:
>
>     === Starting training for language 'kor'
>     /usr/local/bin/language-specific.sh: line 1125: FONTS: unbound variable
>
> I checked the error line in language-specific.sh:
>
>     1124 kor ) MEAN_COUNT="20"
>     1125 WORD_DAWG_FACTOR=0.015
>     1126 NUMBER_DAWG_FACTOR=0.05
>     1127 TRAINING_DATA_ARGUMENTS+=" --infrequent_ratio=10000"
>     1128 TRAINING_DATA_ARGUMENTS+=" --desired_bigrams="
>     1129 GENERATE_WORD_BIGRAMS=0
>     1130 FILTER_ARGUMENTS="--charset_filter=kor --segmenter_lang=kor"
>     1131 test -z "$FONTS" && FONTS=( "${KOREAN_FONTS[@]}" ) ;;
>
>     312 KOREAN_FONTS=( \
>     313 "Arial Unicode MS" \
>     314 "Arial Unicode MS Bold" \
>     315 "Baekmuk Batang Patched" \
>     316 "Baekmuk Batang" \
>     317 "Baekmuk Dotum" \
>     318 "Baekmuk Gulim" \
>     319 "Baekmuk Headline" \
>     320 )
>
> I installed the Korean fonts without problems using ttf_mscorefonts_installer, etc.,
> but I don't know why this error happens.
>
> Can anyone help me?
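For context on the error quoted above: "unbound variable" is the message bash prints when a script runs with `set -u` (nounset) and expands a variable that was never set. So the message points at the shell script, not at missing fonts. The sketch below reproduces that behaviour outside of tesseract and shows one common way to guard such a test. It is only an illustration: the shortened `KOREAN_FONTS` list is a stand-in, the `unguarded` variable is hypothetical, and whether the updated scripts in the master repository use this exact guard is an assumption, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Standalone reproduction; nothing here calls tesseract or tesstrain.sh.
set -u  # treat expansion of unset variables as an error

# Shortened stand-in for the KOREAN_FONTS array in language-specific.sh.
KOREAN_FONTS=( "Baekmuk Batang" "Baekmuk Dotum" )

# Make sure FONTS is really unset, as when no font list was supplied.
unset FONTS

# Unguarded form, as in the quoted line 1131. Under `set -u` the expansion
# of "$FONTS" aborts the shell, so it is run in a subshell to observe it.
if ( test -z "$FONTS" ) 2>/dev/null; then
  unguarded="ok"
else
  unguarded="failed"
fi
echo "unguarded test: $unguarded"

# Guarded form: ${FONTS:-} expands to an empty string when FONTS is unset,
# so the test succeeds and the default font list is assigned.
test -z "${FONTS:-}" && FONTS=( "${KOREAN_FONTS[@]}" )

printf 'font: %s\n' "${FONTS[@]}"
```

If the installed /usr/local/bin/language-specific.sh still has the unguarded test, comparing it with the copy in the master repository is probably the quickest check.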