I am trying to fine-tune Tesseract, but I keep getting the following error
on the training statement:

mgr_.Init(traineddata_path.c_str()):Error:Assert failed:in file ../../src/lstm/lstmtrainer.h, line 110

My script looks as follows:

cd /home/sw/repo/tesseract-ocr
  
mkdir -p ~/tesstutorial/
mkdir -p ~/tesstutorial/trainplusminus
mkdir -p ~/tesstutorial/evalplusminus


src/training/tesstrain.sh --fontlist "Times New Roman" --lang eng \
  --linedata_only --noextract_font_properties \
  --langdata_dir /home/sw/repo/langdata \
  --tessdata_dir /home/sw/repo/tessdata \
  --output_dir ~/tesstutorial/trainplusminus

src/training/tesstrain.sh --fontlist "Times New Roman" --lang eng \
  --linedata_only --noextract_font_properties \
  --langdata_dir /home/sw/repo/langdata/eng \
  --tessdata_dir /home/sw/repo/tessdata \
  --output_dir ~/tesstutorial/evalplusminus


# eng.lstm file gets extracted correctly
src/training/combine_tessdata -e /home/sw/repo/tessdata/eng.traineddata \
  ~/tesstutorial/trainplusminus/eng.lstm

# this command fails and throws the error
src/training/lstmtraining \
   --model_output ~/tesstutorial/trainplusminus/plusminus \
   --continue_from ~/tesstutorial/trainplusminus/eng.lstm  \
   --traineddata ~/tesstutorial/trainplusminus/eng/eng.traineddata   \
   --old_traineddata /home/sw/repo/tessdata/eng.traineddata   \
   --train_listfile ~/tesstutorial/trainplusminus/eng.training_files.txt   \
   --max_iterations 400
   

src/training/lstmtraining --stop_training \
  --continue_from ~/tesstutorial/trainplusminus/plusminus_checkpoint \
  --traineddata ~/tesstutorial/trainplusminus/eng/eng.traineddata \
  --model_output ~/tesstutorial/eng_final.traineddata
  
cp ~/tesstutorial/eng_final.traineddata /usr/share/tesseract/4/tessdata/eng.traineddata


I downloaded eng.traineddata from the "Best" repo, though. Can anyone
help?
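
The assertion text names mgr_.Init(traineddata_path.c_str()), so it appears to
fire when a traineddata file cannot be loaded. One possible cause, which is
only an assumption, is that a file passed to lstmtraining is missing or empty.
Below is a minimal sketch of a check to run before the failing command, using
the paths from the script above. check_file is a hypothetical helper written
for this post, not a Tesseract tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): report whether a file
# exists and is non-empty. Returns non-zero if it does not.
check_file() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Paths copied from the lstmtraining command in the script above.
# "|| true" keeps the script going so that every path gets reported.
check_file "$HOME/tesstutorial/trainplusminus/eng/eng.traineddata" || true
check_file "$HOME/tesstutorial/trainplusminus/eng.lstm" || true
check_file "$HOME/tesstutorial/trainplusminus/eng.training_files.txt" || true
check_file /home/sw/repo/tessdata/eng.traineddata || true
```

If the first path is reported as missing, the tesstrain.sh step probably did
not produce its output, and its log would be the next place to look.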
