Yes, I ran the following command from the tesseract/training directory:

    lstmtraining --stop_training \
      --continue_from ../result/mylangoutput/base_checkpoint \
      --traineddata ../result/mylangcombine/mylang/mylang.traineddata \
      --model_output ../result/mylangoutput/mylang.traineddata

On Monday, January 8, 2018 at 7:36:50 PM UTC+7, shree wrote:
>
> Did you use the --stop_training flag at the end?
>
> ShreeDevi
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Mon, Jan 8, 2018 at 5:51 PM, <[email protected]> wrote:
>
>> Hi all,
>>
>> I am working on a project with Tesseract v4.00, and the traineddata
>> output always comes out the same size after training with my own data.
>> I suspect that I did not follow the steps correctly.
>>
>> The only data that I provided were:
>> 1. training_text
>> 2. puncs (a reduced version of the general punc file provided in the
>>    tesseract GitHub repository)
>> 3. numbers
>> 4. wordlists (I made several wordlists for different training runs,
>>    ranging from 100,000 to 2,000,000)
>> 5. font name (I also used different sets of fonts for different training
>>    runs, ranging from 1 to 20 fonts)
>>
>> The steps that I followed were:
>> 1. Made the tiff file, unicharset and other supporting data using
>>    tesstrain.sh
>> 2. Made the tiff file, unicharset and other supporting data using
>>    tesstrain.sh for evaluation
>> 3. Combined the unicharset, wordlists, puncs, numbers and version_str to
>>    create the starter traineddata using combine_lang_data (I am still
>>    not confident about the value of version_str, though)
>> 4. Trained the data using lstmtraining
>> 5. Combined all output files using lstmtraining --continue_from ...
>>
>> Yet every training run ended with the same size, which is 10.5 MB.
>> Did I do all my steps correctly?
>>
>> I also once trained with WORD_DAWG_FACTOR in language-specific.sh set to
>> 0 and to 1, because I want the recognized text to match my wordlists
>> 100%. But the result did not satisfy me either: some recognized words,
>> such as "USISUSISU", are not in my wordlists.
>> Do you know what causes this?
>>
>> I would really appreciate it if anyone could help or suggest a solution.
>> Thank you!

