Hi Shree,

Thank you.

On Monday, April 30, 2018 at 16:20:44 UTC+2, shree wrote:
> Added to issue on GitHub
>
> https://github.com/tesseract-ocr/tesseract/issues/733
>
> On Thursday, April 26, 2018 at 1:35:30 PM UTC+5:30, Youcef wrote:
>>
>> I'm using the master branch with the tessdata_fast models.
>>
>> On Wednesday, April 25, 2018 at 18:49:22 UTC+2, shree wrote:
>>
>>> Which version of tesseract are you using?
>>>
>>> ShreeDevi
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>
>>> On Wed, Apr 25, 2018 at 8:29 PM, Youcef <youcef...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Tesseract seems to post-process its predictions.
>>>>
>>>> Here is what I get after running OCR on images (same font, same size,
>>>> all generated with text2image):
>>>>
>>>> - an image containing "12345678I" => `123456781`
>>>> - an image containing "GLOTHUVFI" => `GLOTHUVFI`
>>>> - an image containing "12345678H" => `12345678H`
>>>> - an image containing "GLOTHUVFH" => `GLOTHUVFH`
>>>> - an image containing "12345678A" => `123456784`
>>>> - an image containing "GLOTHUVFA" => `GLOTHUVFA`
>>>>
>>>> It looks like Tesseract doesn't like a word made of digits with a single
>>>> letter at the end. In fact, if the letter looks like a digit ("I" and "A"
>>>> look like "1" and "4" respectively), it replaces it with the closest digit.
>>>> I have tried tuning the following parameters, without any change in the
>>>> result:
>>>>
>>>> - segment_penalty_dict_frequent_word
>>>> - language_model_penalty_chartype
>>>>
>>>> Thanks for any help.
>>>>
>>>> Regards
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to tesseract-oc...@googlegroups.com.
>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/4722674d-27a1-4b8e-8c5a-9e07dbe3ca7d%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
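
For anyone who wants to check the pattern in the results quoted above, here is a minimal Python sketch. It does not call Tesseract at all: it only tabulates the six (image text, OCR output) pairs from the original message and lists where they differ. The names `REPORTED`, `substitutions` and `summarize` are hypothetical helpers made up for this illustration, and the "mostly digits" test is just one assumed way of describing the words that were affected.

```python
# (image text, OCR output) pairs copied from the original report.
REPORTED = [
    ("12345678I", "123456781"),
    ("GLOTHUVFI", "GLOTHUVFI"),
    ("12345678H", "12345678H"),
    ("GLOTHUVFH", "GLOTHUVFH"),
    ("12345678A", "123456784"),
    ("GLOTHUVFA", "GLOTHUVFA"),
]


def substitutions(truth, ocr):
    """Return (position, expected_char, ocr_char) for each differing position."""
    if len(truth) != len(ocr):
        raise ValueError("this sketch only compares equal-length strings")
    return [(i, t, o) for i, (t, o) in enumerate(zip(truth, ocr)) if t != o]


def summarize(pairs):
    """Return (truth, ocr, mostly_digits, diffs) for each reported pair."""
    rows = []
    for truth, ocr in pairs:
        diffs = substitutions(truth, ocr)
        # Assumption: "mostly digits" means more than half the characters.
        mostly_digits = sum(c.isdigit() for c in truth) > len(truth) / 2
        rows.append((truth, ocr, mostly_digits, diffs))
    return rows


if __name__ == "__main__":
    for truth, ocr, mostly_digits, diffs in summarize(REPORTED):
        status = "unchanged" if not diffs else f"changed at {diffs}"
        print(f"{truth} -> {ocr}  mostly_digits={mostly_digits}  {status}")
```

On the reported data, the only differences are in the last character of the two digit-dominated words ending in "I" and "A"; the same letters after "GLOTHUVF" come back unchanged, and "H" is unchanged in both contexts.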