Training for all languages, including RTL languages, is done in LTR order, so the left-to-right ordering you see in the training log is expected. See https://github.com/tesseract-ocr/tesseract/issues/2082 and other related issues on GitHub.
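
To illustrate what LTR order looks like for an RTL string, here is a minimal Python sketch. It is not Tesseract's code: the helper name `to_left_to_right` is hypothetical, and it assumes a purely RTL string with no combining marks, digits, or embedded LTR runs (real text would need the full Unicode bidirectional algorithm). The first word of the GROUND TRUTH line in the quoted log appears to be the ordinary reading-order word with its code points reversed, which is what the sketch reproduces.

```python
def to_left_to_right(logical_text: str) -> str:
    """Reverse the code points of a purely RTL string.

    Assumption: no combining marks, digits, or LTR runs are present;
    naive reversal would misplace them.
    """
    return logical_text[::-1]


# Word in reading (logical) order, stored right-to-left character by character.
logical_word = "عربجه"

# Same word with code points in left-to-right order, as it appears in the log.
ltr_word = to_left_to_right(logical_word)
print(ltr_word)
```

Reversing twice returns the original string, so nothing is lost by the reordering in this simple case.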

On Sun, Nov 24, 2019 at 1:28 AM Ishak DÖLEK <[email protected]> wrote:

> Hi,
>
> I am creating a traineddata file for an Arabic font.
> I prepared the ara.training_text file to create synthetic data.
> I created the image and box files with text2image.
> Then I created the .lstmf files.
> I started training.
> During training, the text lines are ordered from left to right. Is that
> normal?
>
> GROUND TRUTH : هجبرع ردراو ىراثآ ردراو ىقارم هعبتت ردناقشلاچ رد ىكذ
> ردشمتيا تئشن ند هيبرح بتكم ىغيدلوا ىلشيريول
> ALIGNED TRUTH : هجبرع ردراو ىراثآ ردراو ىقارم هعبتت ردناقشلاچ رد ىكذ
> ردشمتيا تئشن ند هيبرح بتكم ىغيدلوا ىلشيريول
> BEST OCR TEXT : هجبرع ردراو ىراثآ ردراو ىقارم هعبگ ردناقشلاچ رد ىكن
> ردشمتيا تثشن ند هيبرح بتكم ىغيدلوا ىلشيريول
>
> Otherwise, do I need to reorder each line of the training text from left
> to right before training?
>
> Thanks in advance
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/CAA%3DdkuYk%2BR5UB0ywPzKFeAzrN2u0ebz2CRV7KTPSvTLugMA34Q%40mail.gmail.com

--
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduXAZBEFiE4P06xF6SKPMZoVwPySzSaJZdfgOLxRwqw0cw%40mail.gmail.com.

