The files will be at Google. You have to wait till Ray Smith updates the
repository.
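
In the meantime, it may help to see which characters actually differ between the two models, and whether the published training text even contains them. Below is a minimal sketch, not an official tool. It assumes the unicharset of each model has already been unpacked to a text file (combine_tessdata -u should do this; for LSTM models I believe the component ends in .lstm-unicharset). It also assumes the usual text layout: the first line holds the entry count, and each following line starts with the unichar, then space-separated properties. The function names and file paths are made up for illustration.

```python
from typing import Dict, Iterable, Set


def read_unichars(lines: Iterable[str]) -> Set[str]:
    """Return the first field of every entry line of a unicharset.

    Assumption: line 1 is the entry count; every later non-blank line
    begins with the unichar followed by a space and its properties.
    """
    rows = [ln.rstrip("\n") for ln in lines]
    # Skip the leading count line and ignore blank lines.
    return {row.split(" ", 1)[0] for row in rows[1:] if row.strip()}


def compare(official: Set[str], retrained: Set[str]) -> Dict[str, object]:
    """Summarise which unichars appear in only one of the two sets."""
    return {
        "only_official": sorted(official - retrained),
        "only_retrained": sorted(retrained - official),
        "shared": len(official & retrained),
    }


def distinct_chars(text: str) -> Set[str]:
    """Distinct non-whitespace characters in a training text."""
    return {ch for ch in text if not ch.isspace()}


def compare_files(official_path: str, retrained_path: str) -> Dict[str, object]:
    """Convenience wrapper that reads two unpacked unicharset files."""
    with open(official_path, encoding="utf-8") as a, \
            open(retrained_path, encoding="utf-8") as b:
        return compare(read_unichars(a), read_unichars(b))
```

Calling compare_files() on the two unpacked unicharsets (hypothetical paths such as official/chi_sim.lstm-unicharset and new/chi_sim_new.lstm-unicharset) lists the characters missing from the retrained model. Running distinct_chars() over the contents of chi_sim.training_text then shows whether those characters occur in the training text at all.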

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Tue, Aug 22, 2017 at 12:58 PM, <robertyoung0...@gmail.com> wrote:

> Thanks for your reply.
>
> Do you know where I can find the new langdata files?
>
> On Tuesday, August 22, 2017 at 3:22:36 PM UTC+8, shree wrote:
>>
>> The langdata files have not been updated for 4.00alpha.
>>
>> ShreeDevi
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>> On Tue, Aug 22, 2017 at 12:17 PM, <roberty...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I'm trying to retrain the chi_sim.traineddata model from scratch as a
>>> learning exercise.
>>>
>>> I use chi_sim.training_text from
>>> https://github.com/tesseract-ocr/langdata/tree/master/chi_sim as the
>>> source data and train the model with the command:
>>>
>>> training/lstmtraining --debug_interval 100 \
>>> --traineddata ~/tesstutorial/trainspecial/chi_sim/chi_sim.traineddata \
>>> --net_spec '[1,48,0,1 Ct3,3,16 Mp3,3 Lfys64 Lfx96 Lrx96 Lfx512 O1c1]' \
>>> --model_output ~/tesstutorial/specialoutput/base --learning_rate 20e-4 \
>>> --train_listfile ~/tesstutorial/trainspecial/chi_sim.training_files.txt \
>>> --eval_listfile ~/tesstutorial/evalspecial/chi_sim.training_files.txt \
>>> --max_iterations 3600 &>~/tesstutorial/specialoutput/basetrain.log
>>>
>>>
>>>
>>> The net_spec is the same as the one in the official model package
>>> (chi_sim.traineddata from the tessdata GitHub repository).
>>>
>>>
>>>
>>> After converting the trained model with lstmtraining --stop_training,
>>> a new traineddata file is generated, which I name
>>> chi_sim_new.traineddata.
>>> I keep the official file under its original name, chi_sim.traineddata,
>>> to tell the two apart.
>>>
>>>
>>> Then I extract all the characters from the two traineddata models.
>>>
>>> There are 4384 characters in chi_sim.traineddata, but only 2538
>>> characters in the chi_sim_new.traineddata that I generated.
>>>
>>> Why do the two models contain different characters? Has the source
>>> data in chi_sim.training_text not been updated yet?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> To post to this group, send email to tesser...@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/1111e3f0-588b-456f-90bf-a878f20b1f26%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>

