Language codes recognized for tesseract training are listed in https://github.com/tesseract-ocr/tesseract/blob/master/src/training/language-specific.sh#L21
I suggest that you train under the code of a recognized language that is similar to your ancient language. Once training is done, you can rename the resulting file to use your proper language code.

On Fri, Mar 6, 2020 at 10:07 AM aby tesh wrote:

> Hey,
>
> I have been trying to train tesseract4 for an ancient language but it
> seems it can not recognize its code 'xsa' which is Sabaean Language
>
> [user@laptop tesstraining]$ tesstrain.sh --fonts_dir ./sabaean_fonts
> --lang xsa --linedata_only --noextract_font_properties --langdata_dir
> ./tesslang --tessdata_dir ./tessdata --output_dir .xsatrain
> Creating new directory .xsatrain
>
> === Starting training for language 'xsa'
> ERROR: Error: xsa is not a valid language code
>
> Is it a common problem? Or does it need some update to recognize the
> language?
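
The suggestion at the top of this reply (train under a recognized code, then rename) could look roughly like the sketch below. It is not a tested recipe. The flags are copied from the command in the quoted message, with only `--lang` changed. The function names `train_with_standin` and `rename_traineddata` are made up for illustration, and the stand-in code is left as a parameter because which recognized language is closest to Sabaean is a judgment call. It is also an assumption here that the contents of `./tesslang` are arranged under the stand-in code rather than under `xsa`.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and are not part of Tesseract.

# Run the command from the quoted message, changing only --lang to a code
# that tesstrain.sh accepts. $1 = stand-in language code.
train_with_standin() {
    standin="$1"
    tesstrain.sh --fonts_dir ./sabaean_fonts \
        --lang "$standin" --linedata_only --noextract_font_properties \
        --langdata_dir ./tesslang --tessdata_dir ./tessdata \
        --output_dir .xsatrain
}

# Rename <dir>/<from>.traineddata to <dir>/<to>.traineddata.
# $1 = directory holding the trained model, $2 = stand-in code, $3 = real code.
rename_traineddata() {
    dir="$1"; from="$2"; to="$3"
    if [ ! -f "$dir/$from.traineddata" ]; then
        echo "missing: $dir/$from.traineddata" >&2
        return 1
    fi
    mv -- "$dir/$from.traineddata" "$dir/$to.traineddata"
}
```

For example, with `ara` as a purely illustrative stand-in and `./final_model` as a hypothetical directory that holds the finished model: `train_with_standin ara`, then, once the remaining training steps have produced `ara.traineddata`, `rename_traineddata ./final_model ara xsa`. Keeping the rename in a separate directory avoids touching any stock `ara.traineddata` in `./tessdata`.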

