Did you copy the traineddata file to /usr/share/tesseract-ocr/4.00/tessdata? What's the value of TESSDATA_PREFIX in your 'env' output?
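To check both of those in one go, here is a minimal sketch. It assumes the "Error opening data file" message means Tesseract could not open the file at that path (missing, unreadable, or empty). `check_traineddata` is a hypothetical helper name, not part of Tesseract.

```shell
#!/bin/sh
# Sketch: report whether <dir>/<lang>.traineddata exists, is readable and is non-empty.
# check_traineddata is a hypothetical helper, not a Tesseract command.
check_traineddata() {
    dir=$1
    lang=$2
    file="$dir/$lang.traineddata"

    # Show the value Tesseract would pick up from the environment.
    echo "TESSDATA_PREFIX=${TESSDATA_PREFIX:-<not set>}"

    if [ ! -e "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    if [ ! -r "$file" ]; then
        echo "not readable by $(id -un): $file"
        return 1
    fi
    if [ ! -s "$file" ]; then
        echo "empty file: $file"
        return 1
    fi
    echo "ok: $file"
    return 0
}
```

For your setup the call would be: check_traineddata /usr/share/tesseract-ocr/4.00/tessdata Sanskrit-1017-fast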
What's the output of the following commands?

ls -l /usr/share/tesseract-ocr/4.00/tessdata/Sanskrit-1017-fast.traineddata
combine_tessdata -d /usr/share/tesseract-ocr/4.00/tessdata/Sanskrit-1017-fast.traineddata
tesseract --list-langs --tessdata-dir /usr/share/tesseract-ocr/4.00/tessdata
tesseract --list-langs
tesseract -v

On Wednesday, October 28, 2020 at 3:04:01 AM UTC+5:30 Timo Struppi wrote:

> Help! I get the following error. What am I doing wrong?
>
> Error opening data file
> /usr/share/tesseract-ocr/4.00/tessdata/Sanskrit-1017-fast.traineddata
> Please make sure the TESSDATA_PREFIX environment variable is set to your
> "tessdata" directory.
> Failed loading language 'Sanskrit-1017-fast'
> Tesseract couldn't load any languages!
> Could not initialize tesseract.
>
> On Saturday, October 24, 2020 at 5:53:55 PM UTC+2 Timo Struppi wrote:
>
>> *Perfect!* Thank you very much <3 That's what I was looking for:
>> International Alphabet of Sanskrit Transliteration characters.
>>
>> Can you tell me in which folder I must place the .traineddata file?
>>
>> My configuration:
>> tesseract 4.1.1
>> leptonica-1.79.0
>> libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.3) : libpng 1.6.37 :
>> libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1
>> Found AVX
>> Found SSE
>> Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.8
>> liblz4/1.9.2 libzstd/1.4.4
>>
>> Many thanks again for your fast help
>>
>> On Saturday, October 24, 2020 at 3:12:15 PM UTC+2 shree wrote:
>>
>>> Ray has suggested using plus-minus type of training for adding a couple
>>> of characters to the traineddata. Did you try that?
>>>
>>> Please share the training data you used (box/tiff pairs or lstmf files).
>>>
>>> I have done replace-a-layer training for Sanskrit. It adds the two
>>> characters you want (in addition to many others required for Sanskrit
>>> transliteration). See sample image and attached output. The file is
>>> available at
>>> https://github.com/Shreeshrii/tess5training-sanskrit-iast/tree/main/tessdata/fast
>>>
>>> On Sat, Oct 24, 2020 at 5:31 PM Timo Struppi <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> I don't want to reinvent the wheel by creating a new language, but how
>>>> do I add the letters ṛ and ī to the OCR?
>>>>
>>>> I tried a lot (VietOCR, Linux Intelligent OCR Solution, followed the few
>>>> available tutorials, etc.) for several days, but I still have not managed
>>>> to add a single letter.
>>>>
>>>> Many thanks in advance
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/f23a9be3-dea4-46a6-8e21-dbe9c120d993n%40googlegroups.com
>>>
>>> --
>>>
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/16ae9c7d-74e9-4d76-b998-e004d3540312n%40googlegroups.com.

