In my opinion the error depends on the font: with some fonts there is no error, but with other fonts the error appears.
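To see which characters are involved, the "Failure bytes" dump can be turned back into text and compared with the unicharset, as suggested in the quoted replies. The following is only a minimal sketch, not something from the Tesseract tools. The function names are made up for illustration. It assumes the dump is UTF-8 with each byte above 0x7f printed sign-extended (e.g. `ffffffc2`), and that a unicharset file has a count on its first line followed by one entry per line with the character as the first field. A unicharset can also hold multi-character entries, so the per-character check is approximate.

```python
import unicodedata


def decode_failure_bytes(dump):
    """Turn a 'Failure bytes:' dump back into text.

    Assumption: tokens are hex values, with bytes >= 0x80 sign-extended
    (ffffffc2 means 0xc2), and the byte stream is UTF-8.
    """
    raw = bytearray()
    for token in dump.split():
        raw.append(int(token, 16) & 0xFF)  # keep only the low 8 bits
    return raw.decode("utf-8", errors="replace")


def parse_unicharset(lines):
    """Collect the first field of every entry line.

    Assumption: the first line is the entry count and every later line
    starts with the unichar, followed by a space and its properties.
    """
    it = iter(lines)
    next(it, None)  # skip the count line
    entries = set()
    for line in it:
        fields = line.split()
        if fields:
            entries.add(fields[0])
    return entries


def unencodable_chars(text, entries):
    """Non-space characters of text that have no single-character entry."""
    return sorted({ch for ch in text if not ch.isspace() and ch not in entries})


def control_chars(text):
    """Tabs and other control characters (category Cc), newline excluded."""
    return sorted(
        {ch for ch in text if ch != "\n" and unicodedata.category(ch) == "Cc"}
    )


if __name__ == "__main__":
    # First bytes of the dump in this thread; c2 a9 is the UTF-8 form of "©".
    print(decode_failure_bytes("ffffffc2 ffffffa9 20 ffffffd8 ffffffa8"))
```

To use it on real data, read the unicharset and the training text with `open(path, encoding="utf-8")` and pass the lines and the text to `parse_unicharset`, `unencodable_chars` and `control_chars`. Any character reported by `unencodable_chars` is a candidate for the mismatch.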
On Mon, Jul 2, 2018 at 9:15 AM, john <irrang...@gmail.com> wrote:

> I use tesseract 4.0.0-beta.1. downloaded from this link (UB mannheim)
> <https://github.com/UB-Mannheim/tesseract/tree/v4.0.0-beta.1.20180414>
>
> On Saturday, June 30, 2018 at 7:13:30 PM UTC+4:30, shree wrote:
>>
>> Also check that there is no tab or other unprintable character in your
>> training text.
>>
>> Which version of tesseract are you using? show output of
>>
>> tesseract -v
>>
>>
>> On Sat, Jun 30, 2018 at 8:04 PM Shree Devi Kumar <shree...@gmail.com>
>> wrote:
>>
>>> Then there must be a mismatch between the unicharset you are using and
>>> the training text. eg. check whether the copyright symbol is in your
>>> unicharset.
>>>
>>> On Sat, Jun 30, 2018 at 4:48 PM john <irra...@gmail.com> wrote:
>>>
>>>> I saw that link. this error occured many times,how can i prevent that?
>>>>
>>>> On Saturday, June 30, 2018 at 3:17:26 PM UTC+4:30, shree wrote:
>>>>>
>>>>> see https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#error-messages-from-training
>>>>>
>>>>> On Sat, Jun 30, 2018 at 3:23 PM john <irra...@gmail.com> wrote:
>>>>>
>>>>>> Encoding of string failed! Failure bytes: ffffffc2 ffffffa9 20
>>>>>> ffffffd8 ffffffa8 ffffffd8 ffffffa7 ffffffd8 ffffffae ffffffd8 ffffffaa
>>>>>> ffffffd9 ffffff86 ffffffd8 ffffffa7 20 ffffffd9 ffffff84 ffffffd8 ffffffa7
>>>>>> ffffffd8 ffffffa4 ffffffd8 ffffffb3 20 ffffffdb ffffff8c ffffffd9 ffffff86
>>>>>> ffffffd8 ffffffa7 ffffffd8 ffffffb1 ffffffdb ffffff8c ffffffd8 ffffffa7 20
>>>>>> ffffffd8 ffffffa7 ffffffd8 ffffffa8 20 ffffffd8 ffffffaa ffffffd8 ffffffa8
>>>>>> ffffffd8 ffffffab ffffffd9 ffffff87 20 ffffffd8 ffffffaf ffffffd8 ffffffa7
>>>>>> ffffffd9 ffffff81 ffffffd8 ffffffaa ffffffd8 ffffffb3 ffffffd8 ffffffa7 20
>>>>>> ffffffd9 ffffff86 ffffffdb ffffff8c ffffffd9 ffffff86 ffffffda ffffff86
>>>>>> ffffffd9 ffffff85 ffffffd9 ffffff87 20 ffffffd9 ffffff82 ffffffd9 ffffff84
>>>>>> ffffffd8 ffffffb7 ffffffd9 ffffff85
>>>>>> Can't encode transcription: '۱۹ 2006© باختنا لاؤس یناریا اب تبثه
>>>>>> دافتسا نینچمه قلطم' in language ''
>>>>>> ^C
>>>>>>
>>>>>> when I finetune network for fas language i see top error?
>>>>>> what is wrong with training?
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "tesseract-ocr" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to tesseract-oc...@googlegroups.com.
>>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/11d5277e-2ef1-4ae9-8cb3-3f38290c1dfc%40googlegroups.com.
>>>>>> For more options, visit https://groups.google.com/d/optout.