Thanks for the quick reply. I first got the error after the training process, so I took a step back to reproduce it.
When I train the model with

lstmtraining --traineddata D:/software/Tesseract-OCR-4.0/tessdata/ccy.traineddata -U D:/software/Tesseract-OCR/tessdate/Latin.unicharset --train_listfile D:/software/Tesseract-OCR/training/list.train --net_spec "[1,40,0,1 Ct5,5,64 Mp3,3 Lfys128 Lbx256 Lbx256 O1c1]" --model_output D:/software/Tesseract-OCR/training/model/output

I get a file named output_checkpoint, 200 MB in size. I renamed it to ccy.traineddata and put it in the tessdata folder. *Is this how it is supposed to be done*?

Now, when I run the OCR, I get

Error opening data file D:\software\Tesseract-OCR-4.0\tessdata/ccy.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'ccy'
Tesseract couldn't load any languages!
Could not initialize tesseract.

The file exists, and I can open it in a text editor. *Is there a way to check if a traineddata file is valid*?

Thanks,
Nuno

On Monday, 9 September 2019 at 17:09:39 UTC+1, shree wrote:
>
> Combine-lang-model only creates the starter traineddata. It is used as
> part of the LSTM training process. It cannot be used for recognition.
>
> Training from scratch requires running the lstmtraining command.
>
> On Mon, Sep 9, 2019, 21:36 Nuno Feliciano <[email protected]> wrote:
>
>>
>> Hi,
>>
>> I am trying to make a model from scratch.
>> I created a language using
>> combine_lang_model --input_unicharset
>> D:\software\Tesseract-OCR-4.0\tessdata\Latin.unicharset --script_dir
>> D:\software\Tesseract-OCR-4.0\tessdata --output_dir
>> D:\software\Tesseract-OCR-4.0\training\output *--lang ccy*
>> Then I put the generated ccy.traineddata file in the tessdata folder and
>> executed
>> tesseract --tessdata-dir D:\software\Tesseract-OCR-4.0\tessdata -l ccy
>> <file> stdout, which gives me
>> *Failed loading language 'ccy'*
>> Tesseract couldn't load any languages!
>> Could not initialize tesseract.
>>
>> tesseract --list-langs gives me
>> ccy
>> eng
>> osd
>> ...
>>
>> I got Latin.unicharset from
>> https://raw.githubusercontent.com/tesseract-ocr/langdata_lstm/master/Latin.unicharset
>>
>> Can anyone help?
>>
>> Thanks,
>> Nuno Feliciano
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/f0157ef9-7b83-4fa3-8cf5-3697514d6de0%40googlegroups.com.

