I am using tessdata_best. There are no problems when the image contains text in a single language; Tesseract works fine on those.
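For reference, here is a minimal sketch of how the four `-l` settings tried in this thread (urd, eng, urd+eng, eng+urd) could be run against one image for a side-by-side comparison. It only wraps the standard `tesseract` command line (`-l` and `--tessdata-dir`). The file name `mixed.png`, the path `/path/to/tessdata_best`, and the helper names are placeholders of mine, not anything from the thread, and this is not a fix for the mixed-line problem itself.

```python
import shutil
import subprocess

# The four -l values tried in this thread.
LANG_COMBOS = ["urd", "eng", "urd+eng", "eng+urd"]


def build_command(image_path, langs, tessdata_dir=None):
    """Return the tesseract argv for one language setting (text goes to stdout)."""
    cmd = ["tesseract", image_path, "stdout", "-l", langs]
    if tessdata_dir:
        # --tessdata-dir selects the model folder, e.g. a local copy of tessdata_best.
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


def run_all(image_path, tessdata_dir=None):
    """Run every combination and return a dict mapping langs to recognized text."""
    results = {}
    for langs in LANG_COMBOS:
        proc = subprocess.run(
            build_command(image_path, langs, tessdata_dir),
            capture_output=True,
            text=True,
            encoding="utf-8",
            check=False,  # keep going even if one combination fails
        )
        results[langs] = proc.stdout
    return results


if __name__ == "__main__":
    if shutil.which("tesseract") is None:
        print("tesseract not found on PATH")
    else:
        # Hypothetical image and model paths; replace with your own.
        outputs = run_all("mixed.png", "/path/to/tessdata_best")
        for langs, text in outputs.items():
            print(f"--- {langs} ---\n{text}")
```

Comparing the four outputs on the same image makes it easier to see whether the order of languages in `-l` changes anything for lines that mix both scripts.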
*Note:* The main problem occurs when the same line in the image contains two languages.

On Fri, Aug 30, 2019 at 2:04 PM Shree Devi Kumar <[email protected]> wrote:

> Which traineddata are you using? From tessdata, tessdata_best or
> tessdata_fast?
>
> Are accuracy problems only with mixed image or even for urdu only?
>
> On Fri, Aug 30, 2019 at 2:00 PM Shubham Gupta <[email protected]>
> wrote:
>
>> Tried that combination already.
>> I used following combinations:
>> 1) urd
>> 2) eng
>> 3) urd+eng
>> 4) eng+urd
>>
>> No combination used above gave me meaningful output.
>>
>> Can someone suggest me any new approach on this? Can I create a Hybrid
>> model by creating Training data consisting of both urdu and english and
>> train tesseract on that data? Is it a good approach?
>>
>> Thanks
>> Shubham
>>
>> On Fri, Aug 30, 2019 at 12:25 PM Shree Devi Kumar <[email protected]>
>> wrote:
>>
>>> Try urd+eng to give precedence to Urdu.
>>>
>>> Also see open issue
>>> https://github.com/tesseract-ocr/tesseract/issues/2626
>>>
>>> On Fri, Aug 30, 2019 at 11:26 AM Shubham Gupta <[email protected]>
>>> wrote:
>>>
>>>> Hi All
>>>>
>>>> I have one query i.e. if my Image contains both Urdu and English text,
>>>> I used -l parameter as eng+urd, but my output is all messed up and is not
>>>> correct. Can anyone help me fix this or someone who is facing the same
>>>> problem?
>>>>
>>>> I have attached the image below.
>>>>
>>>>
>>>> Thanks and Regards
>>>> Shubham
>>>
>>> --
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAOYxz4qh%3DhwDQhem%2B7hdGr_ykXiUcsPEatbJFcobT6zna0DZEw%40mail.gmail.com.

