Hello everyone,

I'm training Tesseract LSTM by following
https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
and I've run into a problem with "tesstrain.sh".

To create the training data, the wiki page calls tesstrain.sh like this:

  src/training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
    --noextract_font_properties --langdata_dir ../langdata \
    --tessdata_dir ./tessdata --output_dir ~/tesstutorial/engtrain


My own command, adapted from that one, is below.

  tesstrain.sh --fonts_dir /usr/share/fonts \
    --lang kor \
    --linedata_only \
    --noextract_font_properties \
    --langdata_dir ../langdata-master \
    --tessdata_dir tessdata/tessdata_fast/ \
    --output_dir kortrain

My language is "kor", not "eng".
When I ran this command, I got an error I don't understand:

=== Starting training for language 'kor'
/usr/local/bin/language-specific.sh: 줄 1125: FONTS: unbound variable


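I looked up the reported line in language-specific.sh ("줄 1125" is Korean for "line 1125"), along with the KOREAN_FONTS definition that the block refers to: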

1124     kor ) MEAN_COUNT="20"
1125           WORD_DAWG_FACTOR=0.015
1126           NUMBER_DAWG_FACTOR=0.05
1127           TRAINING_DATA_ARGUMENTS+=" --infrequent_ratio=10000"
1128           TRAINING_DATA_ARGUMENTS+=" --desired_bigrams="
1129           GENERATE_WORD_BIGRAMS=0
1130           FILTER_ARGUMENTS="--charset_filter=kor --segmenter_lang=kor"
1131           test -z "$FONTS" && FONTS=( "${KOREAN_FONTS[@]}" ) ;;

 312 KOREAN_FONTS=( \
 313     "Arial Unicode MS" \
 314     "Arial Unicode MS Bold" \
 315     "Baekmuk Batang Patched" \
 316     "Baekmuk Batang" \
 317     "Baekmuk Dotum" \
 318     "Baekmuk Gulim" \
 319     "Baekmuk Headline" \
 320     )
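
The message looks like what bash prints when a script running with "set -u" (nounset) expands a variable that was never assigned, which would fit the test on line 1131 if FONTS is unset at that point. Below is a minimal sketch of that mechanism, assuming this is what happens here. It is a stand-in, not the actual tesseract script: the two-entry KOREAN_FONTS list and the strict_result variable are made up for illustration.

```shell
#!/usr/bin/env bash
# Stand-in for the pattern on line 1131; not the actual tesseract code.
KOREAN_FONTS=( "Baekmuk Batang" "Baekmuk Dotum" )   # shortened, illustrative list

# Case 1: under nounset, expanding an unassigned FONTS aborts the (sub)shell.
unset FONTS
if ( set -u; test -z "$FONTS" ) 2>/dev/null; then
  strict_result="ok"
else
  strict_result="unbound variable"
fi

# Case 2: a default expansion ${FONTS:-} yields an empty string instead of
# aborting, so the fallback to the language's font list can run.
set -u
test -z "${FONTS:-}" && FONTS=( "${KOREAN_FONTS[@]}" )

echo "strict: $strict_result"
echo "fonts: ${#FONTS[@]}"
```

If this is indeed the cause, the installed fonts are probably not involved, since the script stops before it ever uses them. The wiki page also shows tesstrain.sh being called with a --fontlist option; passing an explicit font there might avoid the fallback path entirely, though that is untested here.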

The Korean fonts installed without any errors (using ttf-mscorefonts-installer, among others),
but I don't know why this error happens.

Can anyone help me?

