Please tell me how I can get the lstm.train config file. I have to work with Tesseract 4; I don't have any other option.
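For reference, the quoted log below shows `read_params_file: Can't open lstm.train` while `TESSDATA_PREFIX=/usr/share/tesseract-ocr/tessdata`, so one thing worth checking is whether that tessdata directory has the config at all. The sketch below assumes that config files sit in a `configs/` subdirectory of the tessdata directory in use, and that the Tesseract source checkout ships the file as `tessdata/configs/lstm.train`. Both should be verified against your own build, since the directory Tesseract actually searches depends on how it was built and how it interprets `TESSDATA_PREFIX`. The function names `has_lstm_train` and `install_lstm_train` are made up for this example.

```shell
#!/bin/sh
# Sketch only: check a tessdata directory for the lstm.train config and,
# if it is missing, copy it from a Tesseract source checkout.
# Assumption: the checkout keeps the file at tessdata/configs/lstm.train.

has_lstm_train() {
    # $1 = tessdata directory (the one expected to contain configs/)
    [ -f "$1/configs/lstm.train" ]
}

install_lstm_train() {
    # $1 = Tesseract source checkout, $2 = tessdata directory
    src="$1/tessdata/configs/lstm.train"
    if [ ! -f "$src" ]; then
        echo "not found in source tree: $src" >&2
        return 1
    fi
    mkdir -p "$2/configs" && cp "$src" "$2/configs/"
}

# Example with the paths from this thread (uncomment to run; writing under
# /usr/share will probably need root):
# has_lstm_train /usr/share/tesseract-ocr/tessdata ||
#     install_lstm_train /home/p/Documents/T/tesseract-master \
#         /usr/share/tesseract-ocr/tessdata
```

If the check fails for the directory named in `TESSDATA_PREFIX`, the likely explanation is that this tessdata directory comes from an older packaged Tesseract, while the `/usr/local/bin/tesseract` binary was built from the 4.00 alpha sources.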
On Wednesday, April 5, 2017 at 1:59:56 PM UTC+5:30, shree wrote:
>
> You do not have the lstm.train config file.
>
> - excuse the brevity, sent from mobile
>
> On 05-Apr-2017 1:55 PM, <[email protected]> wrote:
>
>> After what you said, I tried two ways and I am stuck at the LSTM step.
>>
>> Training
>>
>> Command used:
>>
>> /home/p/Documents/T/tesseract-master/training/lstmtraining -U /home/p/Documents/T/img_frm_3/eng.unicharset \
>>   --script_dir /home/p/Documents/T/TESS_4_ALPHA/langdata-master --debug_interval 100 \
>>   --net_spec '[1,36,0,1 Ct5,5,16 Mp3,3 Lfys64 Lfx128 Lrx128 Lfx256 O1c105]' \
>>   --model_output /home/p/Documents/T/ \
>>   --train_listfile /home/p/Documents/T/img_frm_3/eng.ArialBold.exp0.txt \
>>   --eval_listfile /home/p/Documents/T/img_frm_3/eng.ArialBold.exp0.txt \
>>   --max_iterations 5000 &>/home/p/Documents/T/basetrain.log
>>
>> tail -f basetrain.log
>>
>> The error I get is:
>>
>> Deserialize header failed: BnO. 005 SUBHISHIs TOWN CENTRE
>> Deserialize header failed: MOKILA SHAKARPALLY
>> Deserialize header failed: PHONE: 040-8989898989
>> Load of page 0 failed!
>> Load of images failed!!
>> Deserialize header failed: TIN: 8989898989
>> Deserialize header failed: Station 1D: 01 Time: 03:26:46 PM
>> Deserialize header failed: CASHIER ID:; 3001 Date: 21-02-2017
>> Deserialize header failed: (null)
>> Deserialize header failed: (null)
>>
>> Fine tuning
>>
>> Command used:
>>
>> /home/plianto/Documents/Tvat/tesseract-master/training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
>>   --training_text /home/plianto/Documents/Tvat/img_frm_3/eng.ArialBold.exp0.txt \
>>   --langdata_dir /home/plianto/Documents/Tvat/TESS_4_ALPHA/langdata-master --tessdata_dir /usr/share/tesseract-ocr/tessdata \
>>   --fontlist "Arial Bold" \
>>   --output_dir /home/plianto/Documents/Tvat/engoutput/
>>
>> Error:
>>
>> === Phase E: Generating lstmf files ===
>> Using TESSDATA_PREFIX=/usr/share/tesseract-ocr/tessdata
>> [Wed Apr 5 13:53:05 IST 2017] /usr/local/bin/tesseract /tmp/tmp.KTk3WgBTWk/eng/eng.Arial_Bold.exp0.tif /tmp/tmp.KTk3WgBTWk/eng/eng.Arial_Bold.exp0 lstm.train
>> read_params_file: Can't open lstm.train
>> Tesseract Open Source OCR Engine v4.00.00alpha with Leptonica
>> Page 1
>> ERROR: /tmp/tmp.KTk3WgBTWk/eng/eng.Arial_Bold.exp0.lstmf does not exist or is not readable
>>
>> On Wednesday, April 5, 2017 at 9:07:40 AM UTC+5:30, shree wrote:
>>>
>>> Read
>>>
>>> https://github.com/tesseract-ocr/tesseract/wiki/4.0-with-LSTM
>>> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
>>> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00---Finetune
>>> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00---Replacing-Top-Layer-Example
>>> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00---Replace-Top-Layer
>>>
>>> and
>>>
>>> https://github.com/tesseract-ocr/tesseract/wiki/Documentation
>>> https://github.com/tesseract-ocr/tesseract/wiki/Fonts
>>> https://github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage
>>> https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality
>>> https://github.com/tesseract-ocr/tesseract/wiki/FAQ
>>>
>>> ShreeDevi
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>
>>> On Wed, Apr 5, 2017 at 12:54 AM, <[email protected]> wrote:
>>>
>>>> Can you please share some of your experience in this thread, as there
>>>> are no posts on training Tesseract 4?
>>>>
>>>> 1) Is there any way to add the newly trained data file to the old
>>>> trained data file without replacing the old file?
>>>> 2) If we don't know which fonts may appear in our images, how should we
>>>> proceed with training Tesseract?
>>>>
>>>> On Tuesday, April 4, 2017 at 9:27:06 PM UTC+5:30, Saurabh Srivastav wrote:
>>>>>
>>>>> Yes, I trained my Tesseract for an English font and made it read the
>>>>> characters from an image.
>>>>>
>>>>>> thanks,
>>>>>>> Saurabh Srivastav
>>>>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/9c88494c-6d80-4b31-b247-dbbacd48bc19%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
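A note on the first error in the quoted thread: the `Deserialize header failed:` messages echo lines of text from `eng.ArialBold.exp0.txt`, which suggests that `lstmtraining` read each line of that file as the path of a training data file. As far as I understand it, `--train_listfile` and `--eval_listfile` expect a plain-text file with one `.lstmf` path per line, not the training text itself. A minimal sketch of building such a list, assuming the `.lstmf` files have already been generated somewhere (the helper name `make_lstmf_list` is made up for this example):

```shell
#!/bin/sh
# Sketch only: write a list file with one .lstmf path per line, suitable
# for passing to lstmtraining via --train_listfile / --eval_listfile.

make_lstmf_list() {
    # $1 = directory holding *.lstmf files, $2 = list file to write
    find "$1" -type f -name '*.lstmf' | sort > "$2"
    # Return failure when nothing was found, so an empty list is not
    # handed to the trainer silently.
    [ -s "$2" ]
}

# Example with the output directory from this thread (uncomment to run):
# make_lstmf_list /home/plianto/Documents/Tvat/engoutput \
#     /home/plianto/Documents/Tvat/engoutput/eng.training_files.txt
```

This only helps once the `.lstmf` generation step (Phase E in the tesstrain.sh log) succeeds, which in turn depends on the lstm.train config being found.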

