The issue is resolved. libtiff was missing on the system, so the .tif files could not be processed.
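For anyone hitting the same "does not exist or is not readable" error on a .tif file, here is a rough way to check whether libtiff shows up in the version banner. The original post in this thread shows `tesseract --version` listing `libtiff 4.0.9`, so the sketch just looks for that string. The helper name `has_libtiff` is made up for illustration, and the package name in the comment is an assumption for Debian/Ubuntu systems.

```shell
#!/bin/sh
# has_libtiff TEXT: succeeds if TEXT (a version banner) mentions libtiff.
# (Hypothetical helper, not part of Tesseract.)
has_libtiff() {
  printf '%s\n' "$1" | grep -q 'libtiff'
}

# Only run the check when tesseract is actually installed.
if command -v tesseract >/dev/null 2>&1; then
  # 2>&1 because some builds print the banner to stderr.
  banner="$(tesseract --version 2>&1)"
  if has_libtiff "$banner"; then
    echo "libtiff is listed; TIFF support looks present"
  else
    echo "libtiff is not listed; .tif files may not be readable"
    # Assumption: on Debian/Ubuntu the development package is libtiff-dev,
    # e.g. sudo apt-get install libtiff-dev
    # A Leptonica built from source would likely need rebuilding afterwards.
  fi
fi
```

If libtiff is not listed, installing it and rebuilding whatever links against it is probably the first thing to try before looking at the training scripts.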
On Friday, April 23, 2021 at 12:18:43 AM UTC+5:30 [email protected] wrote:

> I am facing the same issue. I have used following command:
> /tesstrain.sh --fonts_dir /usr/share/fonts/ --lang eng --linedata_only
> --noextract_font_properties --exposures "0" --langdata_dir
> /home/administrator/Downloads/tesseract-4.0.0/langdata --tessdata_dir
> /home/administrator/Downloads/tesseract-4.0.0/tessdata --output_dir
> /home/administrator/pooja/output --fontlist 'FreeMono'
>
> It is giving same error.
> === Starting training for language 'eng'
> [Fri Apr 23 00:13:06 IST 2021] /usr/bin/text2image
> --fonts_dir=/usr/share/fonts/ --font=FreeMono
> --outputbase=/tmp/font_tmp.7XXGMDw4DE/sample_text.txt
> --text=/tmp/font_tmp.7XXGMDw4DE/sample_text.txt
> --fontconfig_tmpdir=/tmp/font_tmp.7XXGMDw4DE
> Rendered page 0 to file /tmp/font_tmp.7XXGMDw4DE/sample_text.txt.tif
>
> === Phase I: Generating training images ===
> Rendering using FreeMono
> [Fri Apr 23 00:13:09 IST 2021] /usr/bin/text2image
> --fontconfig_tmpdir=/tmp/font_tmp.7XXGMDw4DE --fonts_dir=/usr/share/fonts/
> --strip_unrenderable_words --leading=32 --xsize=3600 --char_spacing=0.0
> --exposure=0 --outputbase=/tmp/eng-2021-04-23.RTo/eng.FreeMono.exp0
> --max_pages=0 --font=FreeMono
> --text=/home/administrator/Downloads/tesseract-4.0.0/langdata/eng/eng.training_text
> Rendered page 0 to file /tmp/eng-2021-04-23.RTo/eng.FreeMono.exp0.tif
> Rendered page 1 to file /tmp/eng-2021-04-23.RTo/eng.FreeMono.exp0.tif
> *ERROR: /tmp/eng-2021-04-23.RTo/eng.FreeMono.exp0.tif does not exist or is not readable*
>
> I have checked for *lstm.train* file. It is present. Please help to
> resolve it.
>
> On Monday, September 3, 2018 at 2:50:11 AM UTC+5:30 Shandigutt wrote:
>
>> Thank you Shree. Now it works fine
>>
>> On Sunday, September 2, 2018 at 6:41:28 AM UTC+3, shree wrote:
>>
>>> > read_params_file: Can't open lstm.train
>>>
>>> lstm.train is a config file which is not found.
>>> It is there in tesseract/tessdata/configs
>>>
>>> Make sure it is there in your tessdata directory or your path and can be
>>> found.
>>>
>>> On Sun, Sep 2, 2018 at 3:40 AM, Shandigutt <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I was trying to create LSTM training data using tesstrain.sh. I got the
>>>> below error. Can somebody explain me what has gone wrong,
>>>>
>>>> *Command I used:*
>>>> ./src/training/tesstrain.sh --fonts_dir ../Support/font --lang sin
>>>> --linedata_only \
>>>> --noextract_font_properties --langdata_dir ../langdata \
>>>> --tessdata_dir ../tessdata --output_dir ../training/sintrain
>>>> --fontlist "BhashitaComplex" --training_text
>>>> ../langdata/sin/sin.training_text
>>>>
>>>> *Extract of the output:*
>>>> === Phase E: Generating lstmf files ===
>>>> Using TESSDATA_PREFIX=../tessdata
>>>> [2018 සැප්තැම්බර් 1 වැනි සෙනසුරාදා 21:41:25 +0300]
>>>> /usr/local/bin/tesseract
>>>> /tmp/sin-2018-09-01.E4T/sin.BhashitaComplex.exp0.tif
>>>> /tmp/sin-2018-09-01.E4T/sin.BhashitaComplex.exp0 --psm 6 lstm.train
>>>> ../langdata/sin/sin.config
>>>> read_params_file: Can't open lstm.train
>>>> Tesseract Open Source OCR Engine v4.0.0-beta.4-74-gd8237 with Leptonica
>>>> Page 1
>>>> Page 2
>>>> Page 3
>>>> ERROR: /tmp/sin-2018-09-01.E4T/sin.BhashitaComplex.exp0.lstmf does not
>>>> exist or is not readable
>>>>
>>>> *For the complete output please see the attached err.txt*
>>>>
>>>> *After executing the command I checked the tmp directory it created.
>>>> It was shown as below,*
>>>>
>>>> tharaka@tharaka-laptop-ubuntu:~$ cd /tmp/sin-2018-09-01.E4T/
>>>> tharaka@tharaka-laptop-ubuntu:/tmp/sin-2018-09-01.E4T$ ll
>>>> total 776
>>>> drwx------  2 tharaka tharaka   4096 සැප්  1 21:41 ./
>>>> drwxrwxrwt 50 root    root      4096 සැප්  2 00:10 ../
>>>> -rw-r--r--  1 tharaka tharaka 249413 සැප්  1 21:41 sin.BhashitaComplex.exp0.box
>>>> -rw-r--r--  1 tharaka tharaka 436290 සැප්  1 21:41 sin.BhashitaComplex.exp0.tif
>>>> -rw-r--r--  1 tharaka tharaka   9099 සැප්  1 23:27 sin.BhashitaComplex.exp0.txt
>>>> -rw-r--r--  1 tharaka tharaka   6543 සැප්  1 21:41 sin.unicharset
>>>> -rw-r--r--  1 tharaka tharaka   3053 සැප්  1 21:41 sin.xheights
>>>> -rw-r--r--  1 tharaka tharaka  71704 සැප්  1 23:27 tesstrain.log
>>>> tharaka@tharaka-laptop-ubuntu:/tmp/sin-2018-09-01.E4T$
>>>>
>>>> *My tesseract version:*
>>>> tesseract 4.0.0-beta.4-74-gd8237
>>>>  leptonica-1.77.0
>>>>   libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 :
>>>>   zlib 1.2.11
>>>>  Found SSE
>>>>
>>>> *My OS details,*
>>>> tharaka@tharaka-laptop-ubuntu:/tmp/sin-2018-09-01.E4T$ lsb_release -a
>>>> No LSB modules are available.
>>>> Distributor ID: Ubuntu
>>>> Description:    Ubuntu 18.04.1 LTS
>>>> Release:        18.04
>>>> Codename:       bionic
>>>>
>>>> Appreciate your support on this.
>>>> Thanks
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/7d771008-c142-4302-8b5e-e1fd130cc140%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>> --
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/77e3f2ef-bc92-4838-827b-ff74e898ccaan%40googlegroups.com.
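For the earlier `read_params_file: Can't open lstm.train` error quoted in this thread, shree's advice was to make sure the config file can be found under the tessdata directory. A minimal sketch of that check follows. The helper name `has_config` is hypothetical, and taking the directory from `TESSDATA_PREFIX` with a fallback to `../tessdata` is an assumption based on the `Using TESSDATA_PREFIX=../tessdata` line in the quoted output.

```shell
#!/bin/sh
# has_config DIR NAME: succeeds if DIR/configs/NAME is a readable regular file.
# (Hypothetical helper, not part of Tesseract.)
has_config() {
  [ -f "$1/configs/$2" ] && [ -r "$1/configs/$2" ]
}

# Assumption: the tessdata directory comes from TESSDATA_PREFIX,
# falling back to the ../tessdata path used in the original command.
tessdata_dir="${TESSDATA_PREFIX:-../tessdata}"

if has_config "$tessdata_dir" lstm.train; then
  echo "found $tessdata_dir/configs/lstm.train"
else
  echo "missing $tessdata_dir/configs/lstm.train"
fi
```

If the file is reported missing, copying the `configs` directory from the Tesseract source tree's tessdata into the directory passed as `--tessdata_dir` is what the reply above suggests.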

