Re: [tesseract-ocr] Traineddata always ended in same size and did not match with wordlist

ShreeDevi Kumar Wed, 10 Jan 2018 03:17:15 -0800

On Wed, Jan 10, 2018 at 3:56 PM, <[email protected]> wrote:

> It works !!
> I modified your bash script and ran it. I finally get a different
> traineddata size.
>
> But can I train it from scratch?
> That needs a starting traineddata, which I can get from combine_lang_model,
> doesn't it?
>
>
The starter traineddata is generated by tesstrain.sh. To change what goes
into it, change the files in the langdata folder.


To train from scratch, you need to change the lstmtraining command. It
will not need --continue_from or --old_traineddata.

You will need to add a network specification, such as:

 --net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \
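
As a rough sketch of how the pieces fit together, the script below assembles a
from-scratch lstmtraining command and only prints it. The directory names,
the "eng" language code, the file layout and the iteration count are
placeholder assumptions, not values from this thread; substitute whatever
your tesstrain.sh run actually produced.

```shell
#!/bin/sh
# Sketch only: build a from-scratch lstmtraining command and print it.
# Every path and the language code below are hypothetical placeholders.
set -eu

LANG_CODE="eng"             # assumed language code
TRAIN_DIR="./train_output"  # assumed tesstrain.sh output directory
OUT_DIR="./scratch_output"  # assumed directory for checkpoints

# No --continue_from and no --old_traineddata: the network is created
# from --net_spec instead of being loaded from an existing model.
set -- lstmtraining \
  --traineddata "$TRAIN_DIR/$LANG_CODE/$LANG_CODE.traineddata" \
  --net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \
  --model_output "$OUT_DIR/base" \
  --train_listfile "$TRAIN_DIR/$LANG_CODE.training_files.txt" \
  --max_iterations 5000

# Print the command for review. The printed line is not re-quoted, so run
# the real thing with "$@" rather than by copying this output.
printf '%s\n' "$*"
```

Once the paths point at real files, replacing the final printf with "$@"
runs the command.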

Usually the "best" traineddata files carry the network spec that Ray used
for training as part of their version string.
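
If you have such a version string, the bracketed spec can be cut out of it
with sed. The string below is a made-up placeholder that only imitates the
general shape (colon-separated fields ending in a bracketed spec); it is not
output from a real file. As far as I know, combine_tessdata -d on a
traineddata file prints the real version string, but check that against
your own install.

```shell
#!/bin/sh
# Sketch only: pull the bracketed network spec out of a version string.
# The value below is a hypothetical placeholder, not real tool output.
version_string='4.00.00alpha:eng:example:[1,36,0,1Ct3,3,16Mp3,3Lfys48Lfx96Lrx96Lfx256O1c111]'

# Keep everything from the first '[' through the last ']'.
net_spec=$(printf '%s\n' "$version_string" | sed -n 's/^[^[]*\(\[.*\]\).*$/\1/p')

printf '%s\n' "$net_spec"
```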

See https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
for more details.
