Try replacing a layer. You may need a larger training_text and more iterations.

lstmtraining --model_output ~/tesstutorial/chi_sim_tuned_from_chi_sim/chi_sim_layer \
  --continue_from ~/tesstutorial/chi_sim_tuned_from_chi_sim/chi_sim.lstm \
  --traineddata ~/tesstutorial/chi_sim_train/chi_sim/chi_sim.traineddata \
  --append_index 5 --net_spec '[Lfx192 O1c1]' \
  --train_listfile ~/tesstutorial/chi_sim_train/chi_sim.training_files.txt \
  --max_iterations 30000

On Mon, Mar 25, 2019 at 4:14 PM 易鑫 <[email protected]> wrote:

> Hello, everyone:
>
> I have been working on training eng + chi_sim for several days, but one
> urgent issue confuses me. I have asked this question before but did not
> get a good reply, so I am asking again. Sorry for disturbing you.
>
> My steps are as follows:
>
> src/training/tesstrain.sh --fonts_dir /usr/share/fonts \
>   --training_text ../training_data/chi_sim_tuned.txt \
>   --langdata_dir ../langdata --tessdata_dir ./tessdata --lang chi_sim \
>   --linedata_only --noextract_font_properties --exposures "0" \
>   --workspace_dir ./share/workspace/tmp \
>   --save_box_tiff \
>   --fontlist "NSimSun" \
>     "Times New Roman" \
>     "Arial Unicode MS" \
>     "SimSun" \
>     "Merchant Copy" \
>     "Merchant Copy Doublesize" \
>     "Noto Sans CJK SC" \
>     "Noto Sans Mono CJK SC" \
>   --output_dir ~/tesstutorial/chi_sim_train \
>   --overwrite
>
> mkdir -p ~/tesstutorial/chi_sim_tuned_from_chi_sim
>
> combine_tessdata -e ../tessdata_best/chi_sim.traineddata \
>   ~/tesstutorial/chi_sim_tuned_from_chi_sim/chi_sim.lstm
>
> lstmtraining --model_output ~/tesstutorial/chi_sim_tuned_from_chi_sim/chi_sim_tuned \
>   --continue_from ~/tesstutorial/chi_sim_tuned_from_chi_sim/chi_sim.lstm \
>   --traineddata ~/tesstutorial/chi_sim_train/chi_sim/chi_sim.traineddata \
>   --old_traineddata ../tessdata_best/chi_sim.traineddata \
>   --train_listfile ~/tesstutorial/chi_sim_train/chi_sim.training_files.txt \
>   --max_iterations 3000
>
> lstmtraining --stop_training \
>   --continue_from ~/tesstutorial/chi_sim_tuned_from_chi_sim/chi_sim_tuned_checkpoint \
>   --traineddata ~/tesstutorial/chi_sim_train/chi_sim/chi_sim.traineddata \
>   --model_output ~/tesstutorial/chi_sim_tuned_from_chi_sim/chi_sim_tuned.traineddata
>
> The training text file is attached.
>
> What confuses me is that the result contains some characters that are not
> in the training text file (only chi_sim characters have this problem; eng
> is OK).
>
> Can anyone help me? I have also uploaded the image and the result file.
> Thanks in advance.
>
> Thank you.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/4af9e1d1-218a-4a36-8a77-1b4619b53205%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
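
To pin down the symptom described in the quoted message (characters in the OCR result that do not occur in the training text), a small shell helper along these lines might be useful. It is only a sketch: the function names are made up for illustration, both files are assumed to be UTF-8, and the shell is assumed to run in a UTF-8 locale so that grep splits the text into whole characters rather than bytes.

```shell
# Print each distinct non-whitespace character of a file, one per line.
# LC_ALL=C makes sort compare raw bytes, so different CJK characters are
# never merged by locale collation rules.
unique_chars() {
    grep -o '[^[:space:]]' "$1" | LC_ALL=C sort -u
}

# Print the characters that occur in the OCR result ($2) but not in the
# training text ($1). Both names are hypothetical helpers, not Tesseract tools.
chars_not_in_training() {
    known=$(unique_chars "$1")
    # -F: fixed strings, -x: whole-line match, -v: keep lines that do NOT match.
    # "$known" is a newline-separated pattern list, one character per pattern.
    unique_chars "$2" | grep -vxF -e "$known"
}
```

It could be called as `chars_not_in_training ../training_data/chi_sim_tuned.txt result.txt`, where `result.txt` stands in for whatever file holds the OCR output. An empty result would mean every recognized character also occurs in the training text.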

