I just noticed that Language-specific.sh lists the valid fonts for each 
language. I think I should use all of the fonts listed for a given 
language. 

Regards,
Chen

On Monday, November 23, 2015 at 2:36:51 AM UTC+8, Chen wrote:
>
> I am trying to generate the .traineddata files myself, and I have some 
> questions about the training procedure.
>
> The langdata and tessdata repositories are on GitHub. Is there an official 
> document describing how to convert langdata into the final .traineddata? I 
> don't mean the basic procedure in wiki/TrainingTesseract, but the exact 
> way to reproduce the official .traineddata files. I guess the released 
> lang.traineddata files were generated by Tesstrain.sh, but I can't find the 
> script parameters, such as the fonts used for each language. The important 
> text2image tool also takes many parameters, and I assume the official 
> release did not use a single set of parameters for every language, though 
> I'm not sure. Can anyone guide me on how to reproduce, or nearly reproduce, 
> the official .traineddata?
>
> A lot of effort must have gone into tuning the training parameters, and I 
> don't want to reinvent the wheel. By the way, the reason I want to generate 
> the traineddata myself is that I only need to recognize a subset of the 
> language's characters, so training a lighter package could greatly reduce 
> recognition time. Thanks in advance.
>
> Regards,
> Chen 
>
