*Thanks a lot, shree!*
On Tuesday, 10 September 2019 at 15:25:16 UTC+1, shree wrote:
>
> >I get a file named output_checkpoint of 200 MB. I renamed it to
> >ccy.traineddata and put it in the tessdata folder. *Is this how it's
> >supposed to be done*?
>
> No. Please see
> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#combining-the-output-files
>
> >*Is there a way to check if a traineddata file is valid*?
>
> https://github.com/tesseract-ocr/tesseract/blob/master/doc/combine_tessdata.1.asc
>
> -d *.traineddata* *FILE*…: Lists directory of components from the
> .traineddata file.
>
> combine_tessdata -d tessdata/eng.traineddata
>
> On Tue, Sep 10, 2019 at 7:40 PM Nuno Feliciano <[email protected]> wrote:
>
>> Thanks for the quick reply. The first time I got the error was after the
>> training process, so I took a step back to replicate the error.
>>
>> When I train the model with
>> lstmtraining
>> --traineddata D:/software/Tesseract-OCR-4.0/tessdata/ccy.traineddata
>> -U D:/software/Tesseract-OCR/tessdate/Latin.unicharset
>> --train_listfile D:/software/Tesseract-OCR/training/list.train
>> --net_spec
>> "[1,40,0,1 Ct5,5,64 Mp3,3 Lfys128 Lbx256 Lbx256 O1c1]"
>> --model_output D:/software/Tesseract-OCR/training/model/output
>>
>> I get a file named output_checkpoint of 200 MB. I renamed it to
>> ccy.traineddata and put it in the tessdata folder. *Is this how it's
>> supposed to be done*?
>> Then, when I execute the OCR, I get
>> Error opening data file
>> D:\software\Tesseract-OCR-4.0\tessdata/ccy.traineddata
>> Please make sure the TESSDATA_PREFIX environment variable is set to your
>> "tessdata" directory.
>> Failed loading language 'ccy'
>> Tesseract couldn't load any languages!
>> Could not initialize tesseract.
>>
>> The file exists, and I can open it in a text editor.
>>
>> *Is there a way to check if a traineddata file is valid*?
>>
>> Thanks,
>> Nuno
>>
>> On Monday, 9 September 2019 at 17:09:39 UTC+1, shree wrote:
>>>
>>> combine_lang_model only creates the starter traineddata. It is used as
>>> part of the LSTM training process. It cannot be used for recognition.
>>>
>>> Training from scratch requires running the lstmtraining command.
>>>
>>> On Mon, Sep 9, 2019, 21:36 Nuno Feliciano <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to make a model from scratch.
>>>> I created a language using
>>>> combine_lang_model --input_unicharset
>>>> D:\software\Tesseract-OCR-4.0\tessdata\Latin.unicharset --script_dir
>>>> D:\software\Tesseract-OCR-4.0\tessdata --output_dir
>>>> D:\software\Tesseract-OCR-4.0\training\output *--lang ccy*
>>>> Then I put the generated ccy.traineddata file in the tessdata folder
>>>> and executed
>>>> tesseract --tessdata-dir D:\software\Tesseract-OCR-4.0\tessdata -l ccy
>>>> <file> stdout
>>>> which gives me
>>>> *Failed loading language 'ccy'*
>>>> Tesseract couldn't load any languages!
>>>> Could not initialize tesseract.
>>>>
>>>> tesseract --list-langs gives me
>>>> ccy
>>>> eng
>>>> osd
>>>> ...
>>>>
>>>> I got Latin.unicharset from
>>>> https://raw.githubusercontent.com/tesseract-ocr/langdata_lstm/master/Latin.unicharset
>>>>
>>>> Can anyone help?
>>>>
>>>> Thanks,
>>>> Nuno Feliciano
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/f0157ef9-7b83-4fa3-8cf5-3697514d6de0%40googlegroups.com.
>>>> >>> -- >> You received this message because you are subscribed to the Google Groups >> "tesseract-ocr" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected] <javascript:>. >> To view this discussion on the web visit >> https://groups.google.com/d/msgid/tesseract-ocr/9a4f9c1d-009a-4420-a662-26b2678e253a%40googlegroups.com >> >> <https://groups.google.com/d/msgid/tesseract-ocr/9a4f9c1d-009a-4420-a662-26b2678e253a%40googlegroups.com?utm_medium=email&utm_source=footer> >> . >> > > > -- > > ____________________________________________________________ > भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com > -- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/2eb702f7-16bc-4efe-bad0-1164f7c161f4%40googlegroups.com.
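
For anyone landing on this thread later, here is a minimal sketch of the two steps shree points to: converting the training checkpoint into a usable traineddata file (the `--stop_training` call described in the wiki section linked above) and then listing its components with `combine_tessdata -d`. It is written in Python only so the commands can be assembled and inspected before running them. The helper names (`finalize_command`, `list_components_command`, `run_if_available`) and all paths are made up for illustration and are not part of Tesseract; adjust them to your own setup. The starter traineddata must be the one produced by combine_lang_model, not the renamed checkpoint.

```python
import shutil
import subprocess


def finalize_command(checkpoint, starter_traineddata, output_traineddata):
    """Build the lstmtraining call that turns a checkpoint into a traineddata file."""
    return [
        "lstmtraining",
        "--stop_training",
        "--continue_from", checkpoint,          # e.g. the output_checkpoint file
        "--traineddata", starter_traineddata,   # starter file from combine_lang_model
        "--model_output", output_traineddata,   # final file to copy into tessdata
    ]


def list_components_command(traineddata):
    """Build the combine_tessdata call that lists the components of a traineddata file."""
    return ["combine_tessdata", "-d", traineddata]


def run_if_available(command):
    """Run the command if its executable is on PATH.

    Returns the exit code, or None when the executable was not found.
    """
    if shutil.which(command[0]) is None:
        print("skipped (not on PATH):", " ".join(command))
        return None
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    # Hypothetical paths, loosely modelled on the ones in this thread.
    checkpoint = "D:/software/Tesseract-OCR/training/model/output_checkpoint"
    starter = "D:/software/Tesseract-OCR-4.0/training/output/ccy/ccy.traineddata"
    final = "D:/software/Tesseract-OCR/training/model/ccy.traineddata"

    run_if_available(finalize_command(checkpoint, starter, final))
    run_if_available(list_components_command(final))
```

If the second command prints a component listing rather than an error, the file is at least structurally a traineddata file; whether it recognizes anything well still depends on how far training got. Writing the final file to a separate location first avoids overwriting the starter traineddata, which the conversion step still needs.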

