Re: [tesseract-ocr] Failed loading language

Nuno Feliciano Tue, 10 Sep 2019 07:10:35 -0700

Thanks for the quick reply. I first got the error after the training 
process, so I took a step back to reproduce the error.


When I train the model with:
lstmtraining 
--traineddata D:/software/Tesseract-OCR-4.0/tessdata/ccy.traineddata 
-U D:/software/Tesseract-OCR/tessdate/Latin.unicharset 
--train_listfile D:/software/Tesseract-OCR/training/list.train 
--net_spec
 "[1,40,0,1 Ct5,5,64 Mp3,3 Lfys128 Lbx256 Lbx256 O1c1]" 
 --model_output D:/software/Tesseract-OCR/training/model/output

 I get a 200 MB file named output_checkpoint. I renamed it to 
ccy.traineddata and put it in the tessdata folder. *Is this how it is 
supposed to be done*?
Now, when I run the OCR, I get:
Error opening data file 
D:\software\Tesseract-OCR-4.0\tessdata/ccy.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your 
"tessdata" directory.
Failed loading language 'ccy'
Tesseract couldn't load any languages!
Could not initialize tesseract.

The file exists, and I can open it in a text editor.

*Is there a way to check if a traineddata file is valid*?
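
On the validity question, one possible check is sketched below. It assumes the Tesseract training tools combine_tessdata and lstmtraining are on the PATH; the helper names and the example paths are hypothetical and not taken from this thread. The sketch only builds the command lines: `combine_tessdata -d` lists the components packed into a traineddata file, and `lstmtraining --stop_training` converts a training checkpoint into a traineddata file, which is the documented route instead of renaming the checkpoint.

```python
import shutil
import subprocess


def list_components_cmd(traineddata_path):
    """Command that lists the components packed into a traineddata file."""
    return ["combine_tessdata", "-d", traineddata_path]


def finalize_checkpoint_cmd(checkpoint_path, starter_traineddata, output_path):
    """Command that converts an lstmtraining checkpoint into a traineddata file."""
    return [
        "lstmtraining",
        "--stop_training",
        "--continue_from", checkpoint_path,
        "--traineddata", starter_traineddata,
        "--model_output", output_path,
    ]


def run_if_available(cmd):
    """Run cmd if its executable is on PATH; return the exit code, or None if missing."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd).returncode


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    finalize = finalize_checkpoint_cmd(
        "output_checkpoint", "starter/ccy.traineddata", "ccy.traineddata"
    )
    inspect = list_components_cmd("ccy.traineddata")
    print(" ".join(finalize))
    print(" ".join(inspect))
```

If the listing command fails or reports no components for a file, that file is probably not a traineddata file at all, which would be expected for a renamed checkpoint.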

Thanks,
Nuno

On Monday, 9 September 2019 at 17:09:39 UTC+1, shree wrote:
>
> combine_lang_model only creates the starter traineddata. It is used as 
> part of the LSTM training process and cannot be used for recognition. 
>
> Training from scratch requires running the lstmtraining command.
>
> On Mon, Sep 9, 2019, 21:36 Nuno Feliciano <[email protected]> wrote:
>
>>
>> Hi,
>>
>> I am trying to make a model from scratch.
>> I created a language using 
>> combine_lang_model --input_unicharset 
>> D:\software\Tesseract-OCR-4.0\tessdata\Latin.unicharset --script_dir 
>> D:\software\Tesseract-OCR-4.0\tessdata --output_dir 
>> D:\software\Tesseract-OCR-4.0\training\output *--lang ccy*
>> Then I put the generated ccy.traineddata file in the tessdata folder and 
>> ran
>> tesseract --tessdata-dir D:\software\Tesseract-OCR-4.0\tessdata -l ccy 
>> <file> stdout, which gives me
>> *Failed loading language 'ccy'*
>> Tesseract couldn't load any languages!
>> Could not initialize tesseract.
>>
>> tesseract --list-langs gives me
>> ccy
>> eng
>> osd
>> ...
>>
>> I got Latin.unicharset from 
>> https://raw.githubusercontent.com/tesseract-ocr/langdata_lstm/master/Latin.unicharset
>>
>> Can anyone help?
>>
>> Thanks,
>> Nuno Feliciano
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/tesseract-ocr/f0157ef9-7b83-4fa3-8cf5-3697514d6de0%40googlegroups.com.
>>
>
