See
https://github.com/tesseract-ocr/tesseract/wiki/Training-Tesseract-3.03%E2%80%933.05
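
As a rough illustration only: the heuristic described in the question (pick
words from a word list until every character occurs at least 10 times) could
be sketched in Python as below. This is not how eng.training_text was produced
and it is not a Tesseract tool; the function name, its parameters and the
sample words are all hypothetical.

```python
import random
from collections import Counter


def build_training_text(words, min_count=10, words_per_line=10, seed=0):
    """Hypothetical helper: sample words until every character that appears
    in the word list occurs at least `min_count` times in the output."""
    rng = random.Random(seed)
    words = [w for w in words if w]
    if not words:
        return ""

    # Index the words by the characters they contain.
    by_char = {}
    for w in words:
        for ch in set(w):
            by_char.setdefault(ch, []).append(w)

    counts = Counter()
    chosen = []
    while True:
        # Characters that are still under-represented.
        lacking = [ch for ch in sorted(by_char) if counts[ch] < min_count]
        if not lacking:
            break
        # Pick a random word containing the first lacking character.
        word = rng.choice(by_char[lacking[0]])
        chosen.append(word)
        counts.update(word)  # counts each character of the word

    # Shuffle so words are not grouped by the character that triggered them.
    rng.shuffle(chosen)
    lines = [
        " ".join(chosen[i:i + words_per_line])
        for i in range(0, len(chosen), words_per_line)
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample words are placeholders; a real run would read a wordlist file.
    print(build_training_text(["cat", "dog", "zebra"]))
```

Randomly sampled words lack the punctuation, digits and natural word order of
real text, so output like this would still need to be mixed with, or replaced
by, text that is representative of what you want to recognize.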

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Sat, Apr 7, 2018 at 4:02 PM, Romil Mehla <meh...@gmail.com> wrote:

> Thanks for your reply. I have read about Tesseract 4.0, and Ray mentioned
> how he used so many files to train it, but I don't want to use
> Tesseract 4.0. I want to know about Tesseract 3.05.00. From my
> understanding, for the English language, the eng.training_text file is
> built from the eng.wordlist file in langdata. For a new language, how can I
> build a training text from my new language's wordlist? Any idea who
> created the eng.training_text file? Is there a rule or algorithm for doing
> so, or is it randomly generated from eng.wordlist while maintaining a
> minimum of 10 occurrences of each character in the training text?
>
> Please clarify this, and let me know how to generate the training_text.
>
> On Saturday, April 7, 2018 at 3:46:10 PM UTC+5:30, shree wrote:
>>
>> Just a word list is not enough for training text.
>>
>> For tesseract 4.0.0 it needs to be representative of the text to be
>> recognized.
>>
>> On Sat 7 Apr, 2018, 2:50 PM Romil Mehla, <meh...@gmail.com> wrote:
>>
>>> Is there any program to generate it? I see ambiguous_words.cpp
>>> generating dictionary words and ambiguous words. Where is it used? Or
>>> can it be used to build a unicharambigs file to generate rules?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> To post to this group, send email to tesser...@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/2ce880b4-b750-4be9-a1a0-01f832f679df%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>

