Please read the complete error message: it tells you exactly where the
problem is.

I think you are using typographic ("curly") double quotes (“ ”) rather
than the plain ASCII ones (").
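
To see the difference, here is a small sketch that only counts how many
arguments the shell passes on in each case. count_args is a hypothetical
helper for illustration, not part of tesseract:

```shell
#!/bin/sh
# count_args (hypothetical helper): prints how many arguments it received.
count_args() { echo "$#"; }

# Curly quotes are ordinary characters to the shell, so the space
# between the two words still splits the font name.
curly=$(count_args “Lohit Bengali”)

# ASCII double quotes keep the font name together as a single argument.
plain=$(count_args "Lohit Bengali")

echo "curly quotes: $curly argument(s)"
echo "plain quotes: $plain argument(s)"
```

For your command this would mean typing the last option by hand as
--fontlist "Lohit Bengali".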

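Are you copying and pasting from a word processor? That is probably
causing all the errors: the shell does not treat “ and ” as quoting
characters, so it splits “Lohit Bengali” at the space into two separate
font names, which is what the "Rendering using" lines in your log show.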
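
One way to check a pasted command before running it is to count the
bytes outside the ASCII range. This is only a rough sketch, assuming the
text is UTF-8; non_ascii_bytes is a made-up helper name, and the two
sample strings are fragments of the commands quoted further down:

```shell
#!/bin/sh
# non_ascii_bytes (hypothetical helper): prints the number of bytes in
# its argument that fall outside the 7-bit ASCII range.
non_ascii_bytes() {
    # Delete every ASCII byte, then count what is left.
    n=$(printf '%s' "$1" | LC_ALL=C tr -d '\000-\177' | wc -c)
    # Arithmetic expansion strips any padding that wc may add.
    echo $((n))
}

# Fragment as pasted: contains an en dash and curly quotes.
pasted='-–output_dir /home/jennil/Desktop/pro/output/ben_output --fontlist “Lohit Bengali”'
# Same fragment typed by hand with ASCII dashes and quotes.
typed='--output_dir /home/jennil/Desktop/pro/output/ben_output --fontlist "Lohit Bengali"'

bad=$(non_ascii_bytes "$pasted")
good=$(non_ascii_bytes "$typed")

echo "pasted: $bad non-ASCII byte(s)"
echo "typed:  $good non-ASCII byte(s)"
```

Any count above zero means the command contains something other than
plain ASCII, such as an en dash in place of a hyphen or a curly quote.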



2018-07-23 9:48 GMT+02:00 Jennil Thiyam <[email protected]>:

> I tried using Lohit Bengali and here is the command
>
> /usr/share/tesseract-ocr/./tesstrain.sh --fonts_dir /usr/share/fonts
> --lang ben --linedata_only --noextract_font_properties --langdata_dir
> /home/jennil/Desktop/pro/langdata-master --tessdata_dir
> /usr/share/tesseract-ocr/4.00/tessdata --output_dir
> /home/jennil/Desktop/pro/output/ben_output --fontlist “Lohit Bengali”
>
> and the error i got is
>
> == Starting training for language 'ben'
> [Mon Jul 23 01:18:01 EDT 2018] /usr/bin/text2image
> --fonts_dir=/usr/share/fonts --font=“Lohit 
> --outputbase=/tmp/font_tmp.zAepRNq6Yo/sample_text.txt
> --text=/tmp/font_tmp.zAepRNq6Yo/sample_text.txt
> --fontconfig_tmpdir=/tmp/font_tmp.zAepRNq6Yo
> Could not find font named “Lohit.
> Pango suggested font FreeMono.
> Please correct --font arg.
>
> === Phase I: Generating training images ===
> Rendering using “Lohit
> Rendering using Bengali”
> [Mon Jul 23 01:18:16 EDT 2018] /usr/bin/text2image
> --fontconfig_tmpdir=/tmp/font_tmp.zAepRNq6Yo --fonts_dir=/usr/share/fonts
> --strip_unrenderable_words --leading=32 --char_spacing=0.0 --exposure=0
> --outputbase=/tmp/tmp.abQfzSYB19/ben/ben.Bengali”.exp0 --max_pages=3
> --font=Bengali” --text=/home/jennil/Desktop/pro/langdata-master/ben/ben.
> training_text
> [Mon Jul 23 01:18:16 EDT 2018] /usr/bin/text2image
> --fontconfig_tmpdir=/tmp/font_tmp.zAepRNq6Yo --fonts_dir=/usr/share/fonts
> --strip_unrenderable_words --leading=32 --char_spacing=0.0 --exposure=0
> --outputbase=/tmp/tmp.abQfzSYB19/ben/ben.“Lohit.exp0 --max_pages=3
> --font=“Lohit --text=/home/jennil/Desktop/pro/langdata-master/ben/ben.
> training_text
> Could not find font named Bengali”.
> Pango suggested font FreeMono.
> Please correct --font arg.
> Could not find font named “Lohit.
> Pango suggested font FreeMono.
> Please correct --font arg.
> ERROR: /tmp/tmp.abQfzSYB19/ben/ben.Bengali”.exp0.box does not exist or is
> not readable
> ERROR: /tmp/tmp.abQfzSYB19/ben/ben.“Lohit.exp0.box does not exist or is
> not readable
> ERROR: /tmp/tmp.abQfzSYB19/ben/ben.“Lohit.exp0.box does not exist or is
> not readable
>
> Please help me out, *shreeshrii*.
> I read the link, but I am still confused about the fonts. The Lohit
> Bengali font is already on the system, so why is this happening?
>
>
> Some of the fonts listed when I ran *text2image --fonts_dir
> /usr/share/fonts --list_available_fonts* are:
>
> 01: Liberation Serif Italic
> 102: Likhan Medium
> 103: Lohit Assamese
> *104: Lohit Bengali*
> 105: Lohit Devanagari
> 106: Lohit Gujarati
> 107: Lohit Gurmukhi
> 108: Lohit Kannada
> 109: Lohit Malayalam
> 110: Lohit Odia
> 111: Lohit Tamil
> 112: Lohit Tamil Classical
> 113: Lohit Telugu
> 114: Loma
> 115: Loma Bold
> 116: Loma Bold Oblique
> 117: Loma Oblique
> 118: Manjari
> 119: Manjari Bold
> 120: Manjari Thin
> 121: Meera
> 122: Mitra Mono
> ...
>
> Lohit Bengali is in the list, so please tell me why the error occurs. Do I
> need to do something else too?
>
>
> On Sun, Jul 22, 2018 at 11:00 AM, Shree Devi Kumar <[email protected]>
> wrote:
>
>> See https://github.com/tesseract-ocr/tesseract/wiki/Fonts
>>
>> On Sun 22 Jul, 2018, 8:20 PM Jennil Thiyam, <[email protected]>
>> wrote:
>>
>>> You guys helped me; that error is gone now. But I am unsure about the
>>> fonts. I tried to train Bengali with the "lohit-bengali" font, thinking it
>>> is already in the fonts folder, but I got
>>>
>>> === Starting training for language 'ben'
>>> [Sun Jul 22 10:48:33 EDT 2018] /usr/bin/text2image
>>> --fonts_dir=/usr/share/fonts/truetype --font=“lohit-bengali”
>>> --outputbase=/tmp/font_tmp.z6y7AIvqyI/sample_text.txt
>>> --text=/tmp/font_tmp.z6y7AIvqyI/sample_text.txt
>>> --fontconfig_tmpdir=/tmp/font_tmp.z6y7AIvqyI
>>> Could not find font named “lohit-bengali”.
>>> Pango suggested font FreeMono.
>>> Please correct --font arg.
>>>
>>> === Phase I: Generating training images ===
>>> Rendering using “lohit-bengali”
>>> [Sun Jul 22 10:48:34 EDT 2018] /usr/bin/text2image
>>> --fontconfig_tmpdir=/tmp/font_tmp.z6y7AIvqyI
>>> --fonts_dir=/usr/share/fonts/truetype --strip_unrenderable_words
>>> --leading=32 --char_spacing=0.0 --exposure=0 --outputbase=/tmp/tmp.pBWa4wRH
>>> mt/ben/ben.“lohit-bengali”.exp0 --max_pages=3 --font=“lohit-bengali”
>>> --text=/home/jennil/Desktop/pro/langdata-master/ben/ben.training_text
>>> Could not find font named “lohit-bengali”.
>>> Pango suggested font FreeMono.
>>> Please correct --font arg.
>>> ERROR: /tmp/tmp.pBWa4wRHmt/ben/ben.“lohit-bengali”.exp0.box does not
>>> exist or is not readable
>>> ERROR: /tmp/tmp.pBWa4wRHmt/ben/ben.“lohit-bengali”.exp0.box does not
>>> exist or is not readable
>>>
>>> So please tell me: are all the fonts in this fonts folder already
>>> installed for tesseract or not?
>>>
>>>
>>> On Sun, Jul 22, 2018 at 7:15 AM, Jennil Thiyam <[email protected]>
>>> wrote:
>>>
>>>> Oh, sorry for the mistake. I put two dashes, but it still says
>>>> unrecognised.
>>>>
>>>> On Sun 22 Jul, 2018, 4:27 PM Shree Devi Kumar, <[email protected]>
>>>> wrote:
>>>>
>>>>> needs two dashes,
>>>>>
>>>>> On Sun, Jul 22, 2018 at 12:29 PM <[email protected]> wrote:
>>>>>
>>>>>> Hello again. I fixed the command the way you said and that error is gone,
>>>>>> but now the same "unrecognized argument" error occurs for output_dir.
>>>>>> The error is
>>>>>> ERROR: Unrecognized argument -–output_dir
>>>>>>
>>>>>> my command is
>>>>>>
>>>>>> /usr/share/tesseract-ocr/./tesstrain.sh \
>>>>>>
>>>>>> --fonts_dir /usr/share/fonts \
>>>>>>
>>>>>> --lang ben \
>>>>>>
>>>>>> --linedata_only \
>>>>>>
>>>>>> --noextract_font_properties \
>>>>>>
>>>>>> --langdata_dir /home/jennil/Desktop/pro/langdata-master/ben \
>>>>>>
>>>>>> --tessdata_dir /usr/share/tesseract-ocr/4.00/tessdata \
>>>>>>
>>>>>> -–output_dir /home/jennil/Desktop/pro/output/ben_output \
>>>>>>
>>>>>> --fontlist “Lohit Bengali”
>>>>>>
>>>>>>
>>>>>> please do help
>>>>>>
>>>>>> On Saturday, July 21, 2018 at 1:42:41 PM UTC-4, shree wrote:
>>>>>>>
>>>>>>> --linedata_only\
>>>>>>>
>>>>>>> You need space before the continuation mark \
>>>>>>>
>>>>>>> On Sat 21 Jul, 2018, 10:00 PM , <[email protected]> wrote:
>>>>>>>
>>>>>>>> Can you please point out where to put the space?
>>>>>>>>
>>>>>>>> thank you
>>>>>>>>
>>>>>>>> On Saturday, July 21, 2018 at 12:12:22 PM UTC-4,
>>>>>>>> [email protected] wrote:
>>>>>>>>>
>>>>>>>>> My command is
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> usr/share/tesseract-ocr/./tesstrain.sh \
>>>>>>>>>
>>>>>>>>> --fonts_dir /usr/share/fonts \
>>>>>>>>>
>>>>>>>>> --lang ben \
>>>>>>>>>
>>>>>>>>> --linedata_only\
>>>>>>>>>
>>>>>>>>> --noextract_font_properties \
>>>>>>>>>
>>>>>>>>> --langdata_dir /home/jennil/Desktop/pro/langdata-master/ben\
>>>>>>>>>
>>>>>>>>> --tessdata_dir /usr/share/tesseract-ocr/4.00/tessdata –output_dir
>>>>>>>>> /home/jennil/Desktop/pro/output/ben_output\
>>>>>>>>>
>>>>>>>>> --fontlist “Lohit Bengali”
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> and here is the error
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *ERROR: Unrecognized argument
>>>>>>>>> --linedata_only--noextract_font_properties*
>>>>>>>>>
>>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/37073e8b-f62
>>>>>>>> 8-438c-b1b9-648e90c405b8%40googlegroups.com
>>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/37073e8b-f628-438c-b1b9-648e90c405b8%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> ____________________________________________________________
>>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>>>
>>>>>
>>>>
>>>
>>
>
>

