Hi, I ran into this problem before. I think the path you pass to --traineddata is wrong. It must point to the traineddata file that tesstrain.sh created. Good luck.
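A quick way to check this, as a rough sketch: in the commands quoted below, tesstrain.sh was run with --output_dir ~/tesstutorial/engeval, while lstmtraining reads ~/tesstutorial/engtrain/eng/eng.traineddata, and none of the steps shown creates the engtrain directory. If that file is missing, mgr_.Init() cannot load it and the ASSERT_HOST at line 110 fails. The helper name check_traineddata is made up for illustration, and the paths are assumed from the quoted commands, so adjust them to your own layout.

```shell
#!/bin/sh
# Hypothetical helper: report whether a traineddata file exists and is non-empty.
check_traineddata() {
    if [ -s "$1" ]; then
        echo "found: $1"
        return 0
    fi
    echo "missing or empty: $1" >&2
    return 1
}

# Assumed from the quoted commands; change this if your output directory differs.
TRAIN_DIR="$HOME/tesstutorial/engtrain"

# List the traineddata files that tesstrain.sh actually produced, if any.
find "$HOME/tesstutorial" -name '*.traineddata' 2>/dev/null || true

# Check the exact path that was passed to lstmtraining via --traineddata.
check_traineddata "$TRAIN_DIR/eng/eng.traineddata" ||
    echo "Re-run tesstrain.sh with --output_dir $TRAIN_DIR, or pass one of the files listed above to --traineddata."
```

If the check fails, running tesstrain.sh once more with --output_dir ~/tesstutorial/engtrain (in addition to the engeval run) should give lstmtraining both the traineddata file and the eng.training_files.txt list it expects.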
On Friday, January 25, 2019 at 7:04:36 AM UTC+3:30, 易鑫 wrote:
> Hello, everyone:
> I am a new user of Tesseract 4.0. I am following the instructions
> (https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00)
> to train an LSTM model.
>
> My environment is Ubuntu 16.04, and I compiled Tesseract 4.0 myself.
> I ran into some problems.
>
> I followed these steps.
> 1. I ran this command:
>
> src/training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
>   --noextract_font_properties --langdata_dir ../langdata \
>   --tessdata_dir ./tessdata \
>   --fontlist "Impact Condensed" --output_dir ~/tesstutorial/engeval
>
> It worked.
>
> 2. I ran this command:
>
> mkdir -p ~/tesstutorial/engoutput
> training/lstmtraining --debug_interval 100 \
>   --traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
>   --net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \
>   --model_output ~/tesstutorial/engoutput/base --learning_rate 20e-4 \
>   --train_listfile ~/tesstutorial/engtrain/eng.training_files.txt \
>   --eval_listfile ~/tesstutorial/engeval/eng.training_files.txt \
>   --max_iterations 5000 &>~/tesstutorial/engoutput/basetrain.log
>
> Here I am confused, because I am currently in the tesseract directory and
> *I cannot find a training folder under this directory.*
>
> I assumed that after installing Tesseract successfully, the system would
> recognize the lstmtraining command, so I used this command instead:
>
> lstmtraining --debug_interval 100 \
>   --traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
>   --net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \
>   --model_output ~/tesstutorial/engoutput/base --learning_rate 20e-4 \
>   --train_listfile ~/tesstutorial/engtrain/eng.training_files.txt \
>   --eval_listfile ~/tesstutorial/engeval/eng.training_files.txt \
>   --max_iterations 5000
>
> There is an error.
>
> mgr_.Init(traineddata_path.c_str()):Error:Assert failed:in file ../../src/lstm/lstmtrainer.h, line 110
> Segmentation fault (core dumped)
>
> I looked at the source code in lstmtrainer.h:
>
> 102 // assumed that the character set is to be re-mapped from old_traineddata to
> 103 // the new, with consequent change in weight matrices etc.
> 104 bool TryLoadingCheckpoint(const char* filename, const char* old_traineddata);
> 105
> 106 // Initializes the character set encode/decode mechanism directly from a
> 107 // previously setup traineddata containing dawgs, UNICHARSET and
> 108 // UnicharCompress. Note: Call before InitNetwork!
> 109 void InitCharSet(const std::string& traineddata_path) {
> 110   ASSERT_HOST(mgr_.Init(traineddata_path.c_str()));
> 111   InitCharSet();
> 112 }
> 113 void InitCharSet(const TessdataManager& mgr) {
> 114   mgr_ = mgr;
> 115   InitCharSet();
> 116 }
>
> I don't know how to solve the problem. Can anyone help me? Thanks in
> advance. Sorry for my poor English.

