*Thanks a lot, shree!*

On Tuesday, 10 September 2019 at 15:25:16 UTC+1, shree wrote:
>
> >I get a 200 MB file named output_checkpoint. I renamed it to 
> ccy.traineddata and put it in the tessdata folder. *Is this how it's 
> supposed to be done*?
>
> No. Please see 
> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#combining-the-output-files
>
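
The step described in the linked wiki section can be sketched roughly as below. This is only an illustration: `finalize_model` and `DRY_RUN` are hypothetical names introduced here, and the three paths are placeholders to be replaced with the checkpoint, starter traineddata, and output locations from your own setup. The `lstmtraining` flags are the ones the wiki section shows.

```shell
#!/bin/sh
# Sketch: turn an LSTM training checkpoint into a usable .traineddata file.
# finalize_model is a hypothetical helper, not part of Tesseract.
finalize_model() {
    checkpoint=$1   # checkpoint written by training, e.g. output_checkpoint
    starter=$2      # starter traineddata created by combine_lang_model
    output=$3       # final model file, e.g. ccy.traineddata
    # Build the command as the positional parameters so paths with
    # spaces stay intact.
    set -- lstmtraining --stop_training \
        --continue_from "$checkpoint" \
        --traineddata "$starter" \
        --model_output "$output"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"   # print the command instead of running it
    else
        "$@"
    fi
}
```

Calling `DRY_RUN=1 finalize_model <checkpoint> <starter> <output>` prints the command for review; without `DRY_RUN` it runs `lstmtraining`. The resulting file, not the renamed checkpoint, is what goes into the tessdata folder.
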
> >*Is there a way to check if a traineddata file is valid*?
>
>
> https://github.com/tesseract-ocr/tesseract/blob/master/doc/combine_tessdata.1.asc
>  
>
> -d *.traineddata* *FILE*…: Lists directory of components from the 
> .traineddata file.  
>
> combine_tessdata -d tessdata/eng.traineddata 
>
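
The check above can be wrapped in a small helper, sketched here under some assumptions. `check_traineddata` is a hypothetical name, and the test for an `lstm` entry in the listing is an assumption about the output format of `combine_tessdata -d`, which may vary between Tesseract versions. Reading the listing yourself remains the reliable check.

```shell
#!/bin/sh
# Sketch: sanity-check a .traineddata file before pointing tesseract at it.
# check_traineddata is a hypothetical helper, not part of Tesseract.
check_traineddata() {
    file=$1
    if [ ! -s "$file" ]; then
        echo "missing or empty: $file" >&2
        return 1
    fi
    if ! command -v combine_tessdata >/dev/null 2>&1; then
        echo "combine_tessdata not found on PATH" >&2
        return 2
    fi
    # List the components packed inside the file.
    listing=$(combine_tessdata -d "$file" 2>&1)
    echo "$listing"
    # Assumption: a model usable by the LSTM engine lists an "lstm"
    # component; a starter traineddata or a renamed checkpoint would not.
    case $listing in
        *lstm:*) return 0 ;;
        *) echo "no lstm component listed in $file" >&2; return 3 ;;
    esac
}
```

For example, `check_traineddata tessdata/ccy.traineddata` prints the component listing and returns non-zero if the file is missing, the tool is unavailable, or no `lstm` entry appears.
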
> On Tue, Sep 10, 2019 at 7:40 PM Nuno Feliciano <[email protected]> wrote:
>
>>
>> Thanks for the quick reply. The first time I got the error was after the 
>> learning process, so I did a step backwards to replicate the error.
>>
>> When I train the model
>> lstmtraining 
>> --traineddata D:/software/Tesseract-OCR-4.0/tessdata/ccy.traineddata 
>> -U D:/software/Tesseract-OCR/tessdate/Latin.unicharset 
>> --train_listfile D:/software/Tesseract-OCR/training/list.train 
>> --net_spec
>>  "[1,40,0,1 Ct5,5,64 Mp3,3 Lfys128 Lbx256 Lbx256 O1c1]" 
>>  --model_output D:/software/Tesseract-OCR/training/model/output
>>
>>  I get a 200 MB file named output_checkpoint. I renamed it to 
>> ccy.traineddata and put it in the tessdata folder. *Is this how it's 
>> supposed to be done*?
>> Then, when I run the OCR, I get
>> Error opening data file 
>> D:\software\Tesseract-OCR-4.0\tessdata/ccy.traineddata
>> Please make sure the TESSDATA_PREFIX environment variable is set to your 
>> "tessdata" directory.
>> Failed loading language 'ccy'
>> Tesseract couldn't load any languages!
>> Could not initialize tesseract.
>>
>> The file exists, and I can open in a text editor.
>>
>> *Is there a way to check if a traineddata file is valid*?
>>
>> Thanks,
>> Nuno
>>
>> On Monday, 9 September 2019 at 17:09:39 UTC+1, shree wrote:
>>>
>>> combine_lang_model only creates the starter traineddata. It is used as 
>>> part of the LSTM training process and cannot be used for recognition. 
>>>
>>> Training from scratch requires running the lstmtraining command.
>>>
>>> On Mon, Sep 9, 2019, 21:36 Nuno Feliciano <[email protected]> wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>> I am trying to make a model from scratch.
>>>> I created a language using 
>>>> combine_lang_model --input_unicharset 
>>>> D:\software\Tesseract-OCR-4.0\tessdata\Latin.unicharset --script_dir 
>>>> D:\software\Tesseract-OCR-4.0\tessdata --output_dir 
>>>> D:\software\Tesseract-OCR-4.0\training\output *--lang ccy*
>>>> Then I put the generated ccy.traineddata file in the tessdata folder and 
>>>> executed
>>>> tesseract --tessdata-dir D:\software\Tesseract-OCR-4.0\tessdata -l ccy 
>>>> <file> stdout, which gives me
>>>> *Failed loading language 'ccy'*
>>>> Tesseract couldn't load any languages!
>>>> Could not initialize tesseract.
>>>>
>>>> tesseract --list-langs gives me
>>>> ccy
>>>> eng
>>>> osd
>>>> ...
>>>>
>>>> I got Latin.unicharset from 
>>>> https://raw.githubusercontent.com/tesseract-ocr/langdata_lstm/master/Latin.unicharset
>>>>
>>>> Can anyone help?
>>>>
>>>> Thanks,
>>>> Nuno Feliciano
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/tesseract-ocr/f0157ef9-7b83-4fa3-8cf5-3697514d6de0%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/tesseract-ocr/f0157ef9-7b83-4fa3-8cf5-3697514d6de0%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>
>
> -- 
>
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
