Thanks for the help, everyone. That error is gone now, but I am not sure about the fonts. I tried to train Bengali with the "lohit-bengali" font, thinking it was already in the fonts folder, but I got this:

=== Starting training for language 'ben'
[Sun Jul 22 10:48:33 EDT 2018] /usr/bin/text2image --fonts_dir=/usr/share/fonts/truetype --font=“lohit-bengali” --outputbase=/tmp/font_tmp.z6y7AIvqyI/sample_text.txt --text=/tmp/font_tmp.z6y7AIvqyI/sample_text.txt --fontconfig_tmpdir=/tmp/font_tmp.z6y7AIvqyI
Could not find font named “lohit-bengali”.
Pango suggested font FreeMono.
Please correct --font arg.

=== Phase I: Generating training images ===
Rendering using “lohit-bengali”
[Sun Jul 22 10:48:34 EDT 2018] /usr/bin/text2image --fontconfig_tmpdir=/tmp/font_tmp.z6y7AIvqyI --fonts_dir=/usr/share/fonts/truetype --strip_unrenderable_words --leading=32 --char_spacing=0.0 --exposure=0 --outputbase=/tmp/tmp.pBWa4wRHmt/ben/ben.“lohit-bengali”.exp0 --max_pages=3 --font=“lohit-bengali” --text=/home/jennil/Desktop/pro/langdata-master/ben/ben.training_text
Could not find font named “lohit-bengali”.
Pango suggested font FreeMono.
Please correct --font arg.
ERROR: /tmp/tmp.pBWa4wRHmt/ben/ben.“lohit-bengali”.exp0.box does not exist or is not readable
ERROR: /tmp/tmp.pBWa4wRHmt/ben/ben.“lohit-bengali”.exp0.box does not exist or is not readable

So please tell me: are all the fonts in this fonts folder already available to Tesseract, or not?

On Sun, Jul 22, 2018 at 7:15 AM, Jennil Thiyam <[email protected]> wrote:

> Oh, sorry for the mistake. I put two dashes, but it still says unrecognized.
>
> On Sun 22 Jul, 2018, 4:27 PM Shree Devi Kumar, <[email protected]> wrote:
>
>> needs two dashes,
>>
>> On Sun, Jul 22, 2018 at 12:29 PM <[email protected]> wrote:
>>
>>> Hello again, I fixed the command the way you said and that error is
>>> gone, but now the same "Unrecognized argument" error occurs for output_dir.
>>> The error is
>>> ERROR: Unrecognized argument -–output_dir
>>>
>>> My command is
>>>
>>> /usr/share/tesseract-ocr/./tesstrain.sh \
>>> --fonts_dir /usr/share/fonts \
>>> --lang ben \
>>> --linedata_only \
>>> --noextract_font_properties \
>>> --langdata_dir /home/jennil/Desktop/pro/langdata-master/ben \
>>> --tessdata_dir /usr/share/tesseract-ocr/4.00/tessdata \
>>> -–output_dir /home/jennil/Desktop/pro/output/ben_output \
>>> --fontlist “Lohit Bengali”
>>>
>>> Please help.
>>>
>>> On Saturday, July 21, 2018 at 1:42:41 PM UTC-4, shree wrote:
>>>>
>>>> --linedata_only\
>>>>
>>>> You need a space before the continuation mark \
>>>>
>>>> On Sat 21 Jul, 2018, 10:00 PM , <[email protected]> wrote:
>>>>
>>>>> Can you please point out where to put the space?
>>>>>
>>>>> Thank you
>>>>>
>>>>> On Saturday, July 21, 2018 at 12:12:22 PM UTC-4, [email protected] wrote:
>>>>>>
>>>>>> My command is
>>>>>>
>>>>>> usr/share/tesseract-ocr/./tesstrain.sh \
>>>>>> --fonts_dir /usr/share/fonts \
>>>>>> --lang ben \
>>>>>> --linedata_only\
>>>>>> --noextract_font_properties \
>>>>>> --langdata_dir /home/jennil/Desktop/pro/langdata-master/ben\
>>>>>> --tessdata_dir /usr/share/tesseract-ocr/4.00/tessdata –output_dir
>>>>>> /home/jennil/Desktop/pro/output/ben_output\
>>>>>> --fontlist “Lohit Bengali”
>>>>>>
>>>>>> and here is the error
>>>>>>
>>>>>> *ERROR: Unrecognized argument --linedata_only--noextract_font_properties*
>>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/37073e8b-f628-438c-b1b9-648e90c405b8%40googlegroups.com.
>>>>> For more options, visit https://groups.google.com/d/optout.
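
Both failures quoted in this thread appear to involve typographic characters that were substituted for plain ASCII somewhere along the way. The rejected option is printed as -–output_dir (a hyphen followed by an en dash), and the log shows the curly quotes being passed to text2image as part of the font name (--font=“lohit-bengali”), because the shell only treats ASCII " and ' as quoting characters. The sketch below flags any argument that contains a byte outside printable ASCII, so a command can be checked before it is run. The function name check_ascii_args is hypothetical, not part of Tesseract.

```shell
#!/bin/sh
# Sketch: report arguments containing bytes outside printable ASCII,
# such as the curly quotes “ ” or the en dash – seen in this thread.
# check_ascii_args is a hypothetical helper, not a Tesseract tool.
check_ascii_args() {
  rc=0
  for arg in "$@"; do
    # LC_ALL=C makes grep compare raw bytes, so every byte of a
    # multi-byte UTF-8 character falls outside the range space..tilde.
    if printf '%s\n' "$arg" | LC_ALL=C grep -q '[^ -~]'; then
      printf 'non-ASCII character in argument: %s\n' "$arg"
      rc=1
    fi
  done
  # Returns 0 when every argument is clean, 1 otherwise.
  return "$rc"
}

# Example with two of the options as they appear in the thread.
# "|| true" keeps the script going after the expected non-zero status.
check_ascii_args \
  --lang ben \
  -–output_dir /home/jennil/Desktop/pro/output/ben_output \
  --fontlist “Lohit Bengali” || true
```

Retyping the flagged options by hand (two ASCII hyphens, straight double quotes) should clear this class of error. If the font is still not found after that, the name may simply differ from what Pango reports: text2image --list_available_fonts --fonts_dir=/usr/share/fonts should print the names that can be passed to --fontlist. The earlier command in this thread used "Lohit Bengali", which may be the right name, but that is an assumption until the listing confirms it.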

