The problem is still there. I looked at those links, but the error persists.

On Tue, Jul 3, 2018 at 12:54 AM, Shree Devi Kumar <[email protected]>
wrote:

> Also see https://github.com/tesseract-ocr/tesseract/issues/549
>
>
>
> On Mon, Jul 2, 2018 at 7:45 PM Shree Devi Kumar <[email protected]>
> wrote:
>
>> You can use find_fonts with your training_text to locate the fonts to use.
>>
>> Modify the following command to match your directory setup and try it:
>>
>> echo "###### FIND FONTS ######"
>> # Find fonts which can render your training_text. Run `fc-cache -vf` to refresh cache.
>> # You can change the minimum coverage % as needed.
>> # This process can take a while if you have a number of installed fonts.
>> # Review the generated fontlist and modify, if needed.
>> # 2000 fonts found. Use a smaller set
>>
>> nice text2image --find_fonts \
>> --fonts_dir $fonts_dir \
>> --text $langdata_dir/$Lang/$Lang.training_text \
>> --min_coverage 0.999  \
>> --render_per_font=false \
>> --outputbase $langdata_dir/$Lang/$Lang \
>> |& grep raw \
>>  | sed -e 's/ :.*/@ \\/g' \
>>  | sed -e "s/^/ '/" \
>>  | sed -e "s/@/'/g" > $langdata_dir/$Lang/$Lang.fontslist.txt
>>
>> On Mon, Jul 2, 2018 at 12:06 PM ran go <[email protected]> wrote:
>>
>>> In my opinion the error depends on the font: some fonts produce no
>>> error, but others do.
>>>
>>> On Mon, Jul 2, 2018 at 9:15 AM, john <[email protected]> wrote:
>>>
>>>> I use Tesseract 4.0.0-beta.1, downloaded from this link (UB Mannheim):
>>>> <https://github.com/UB-Mannheim/tesseract/tree/v4.0.0-beta.1.20180414>
>>>>
>>>> On Saturday, June 30, 2018 at 7:13:30 PM UTC+4:30, shree wrote:
>>>>>
>>>>> Also check that there is no tab or other unprintable character in your
>>>>> training text.
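
One way to run this check is a short Python sketch that lists control characters (tabs included) with their line and column. The function name is hypothetical, and it only flags Unicode category Cc, so the zero-width non-joiner that Persian text legitimately uses is left alone.

```python
import unicodedata

def find_control_chars(text):
    """Return (line, column, code point) for every control character in text."""
    hits = []
    # Split on "\n" only, so other control characters stay visible to the scan.
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) == "Cc":
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits

# Inline sample; for a real run, read the training_text file with encoding="utf-8".
sample = "ab\tc\nclean line"
print(find_control_chars(sample))
```

Any hit reported for the real training text is a candidate for removal or replacement before rendering.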
>>>>>
>>>>> Which version of tesseract are you using? Show the output of
>>>>>
>>>>> tesseract -v
>>>>>
>>>>>
>>>>> On Sat, Jun 30, 2018 at 8:04 PM Shree Devi Kumar <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Then there must be a mismatch between the unicharset you are using
>>>>>> and the training text. For example, check whether the copyright symbol
>>>>>> is in your unicharset.
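
A rough way to look for such a mismatch is sketched below in Python. It assumes the usual unicharset layout (first line is the entry count, each later line starts with the unichar, and the space entry is written as NULL). The helper names are hypothetical. The check is only an approximation, because it compares single code points while unicharset entries can be multi-character units or special markers.

```python
def load_unichars(unicharset_lines):
    """Collect the first field of each entry line; line 1 is the entry count."""
    units = set()
    for line in unicharset_lines[1:]:
        field = line.rstrip("\n").split(" ", 1)[0]
        if field:
            # Assumption: the space character is stored as the literal NULL.
            units.add(" " if field == "NULL" else field)
    return units

def missing_chars(text, units):
    """Return code points of text that appear in no unicharset entry."""
    covered = set("".join(units))
    return sorted({ch for ch in text if not ch.isspace() and ch not in covered})

# Tiny inline example; real use would read both files with encoding="utf-8".
example_unicharset = ["3", "NULL 0 NULL 0", "a 3 0,255,0,255 NULL 0", "b 3 0,255,0,255 NULL 0"]
print(missing_chars("ab \u00a9", load_unichars(example_unicharset)))
```

Characters reported as missing either need to be added to the unicharset or removed from the training text.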
>>>>>>
>>>>>> On Sat, Jun 30, 2018 at 4:48 PM john <[email protected]> wrote:
>>>>>>
>>>>>>> I saw that link. This error occurred many times; how can I prevent
>>>>>>> it?
>>>>>>>
>>>>>>> On Saturday, June 30, 2018 at 3:17:26 PM UTC+4:30, shree wrote:
>>>>>>>>
>>>>>>>> See https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#error-messages-from-training
>>>>>>>>
>>>>>>>> On Sat, Jun 30, 2018 at 3:23 PM john <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Encoding of string failed! Failure bytes: ffffffc2 ffffffa9 20
>>>>>>>>> ffffffd8 ffffffa8 ffffffd8 ffffffa7 ffffffd8 ffffffae ffffffd8 
>>>>>>>>> ffffffaa
>>>>>>>>> ffffffd9 ffffff86 ffffffd8 ffffffa7 20 ffffffd9 ffffff84 ffffffd8 
>>>>>>>>> ffffffa7
>>>>>>>>> ffffffd8 ffffffa4 ffffffd8 ffffffb3 20 ffffffdb ffffff8c ffffffd9 
>>>>>>>>> ffffff86
>>>>>>>>> ffffffd8 ffffffa7 ffffffd8 ffffffb1 ffffffdb ffffff8c ffffffd8 
>>>>>>>>> ffffffa7 20
>>>>>>>>> ffffffd8 ffffffa7 ffffffd8 ffffffa8 20 ffffffd8 ffffffaa ffffffd8 
>>>>>>>>> ffffffa8
>>>>>>>>> ffffffd8 ffffffab ffffffd9 ffffff87 20 ffffffd8 ffffffaf ffffffd8 
>>>>>>>>> ffffffa7
>>>>>>>>> ffffffd9 ffffff81 ffffffd8 ffffffaa ffffffd8 ffffffb3 ffffffd8 
>>>>>>>>> ffffffa7 20
>>>>>>>>> ffffffd9 ffffff86 ffffffdb ffffff8c ffffffd9 ffffff86 ffffffda 
>>>>>>>>> ffffff86
>>>>>>>>> ffffffd9 ffffff85 ffffffd9 ffffff87 20 ffffffd9 ffffff82 ffffffd9 
>>>>>>>>> ffffff84
>>>>>>>>> ffffffd8 ffffffb7 ffffffd9 ffffff85
>>>>>>>>> Can't encode transcription: '۱۹ 2006© باختنا لاؤس یناریا اب تبثه
>>>>>>>>> دافتسا نینچمه قلطم' in language ''
>>>>>>>>> ^C
>>>>>>>>>
>>>>>>>>> When I fine-tune the network for the fas language, I see the error above.
>>>>>>>>> What is wrong with my training?
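
The failure bytes in the log are UTF-8 bytes printed as sign-extended 32-bit values, so masking each token to its low 8 bits recovers the text. A minimal Python sketch (the helper name is hypothetical), applied to the first few tokens of the dump:

```python
def decode_failure_bytes(dump):
    """Turn a sign-extended hex dump (e.g. 'ffffffc2 ffffffa9 20') into text."""
    # Keep only the low 8 bits of each token, then decode the bytes as UTF-8.
    raw = bytes(int(token, 16) & 0xFF for token in dump.split())
    return raw.decode("utf-8", errors="replace")

# First seven tokens copied from the log.
sample = "ffffffc2 ffffffa9 20 ffffffd8 ffffffa8 ffffffd8 ffffffa7"
decoded = decode_failure_bytes(sample)
print(decoded)
print(f"first character: U+{ord(decoded[0]):04X}")
```

The first decoded character is U+00A9, the copyright sign, which suggests that this is the character the unicharset could not encode.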
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>> send an email to [email protected].
>>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>>>>> To view this discussion on the web visit
>>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/11d5277e-
>>>>>>>>> 2ef1-4ae9-8cb3-3f38290c1dfc%40googlegroups.com
>>>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/11d5277e-2ef1-4ae9-8cb3-3f38290c1dfc%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>> .
>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> ____________________________________________________________
>>>>>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>>>>>>
