[tesseract-ocr] Regarding Tesseract OCR Training Data supporting Vietnamese

Tuan Nguyen Huy Mon, 08 Dec 2014 04:29:19 -0800

Dear all,

I'm a freelance software developer from Vietnam.
I am currently working on improving the Tesseract OCR training data for the 
Vietnamese language.
I am having some trouble training new data for Vietnamese, as described 
below:


1. Could someone share with me the process and the tools that Google used 
to make the .tif/.box files?
If possible, please also share guidelines on how to use those tools.

2. Did Google include Vietnamese fonts in the current training data for 
Vietnamese? 
If so, could someone let me know how to check which fonts were used?

3. Could someone share with me some of the .tif/.box files that Google made 
and included in the current training data for Vietnamese? 
I would like to know what standards those .tif/.box files follow (font 
size, image resolution, etc.).

Thank you very much for taking the time to answer my questions.

Best regards.

