Please read the wiki page on training Tesseract 4.0 and the presentation
files by Ray Smith in docs.
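
As a rough sanity check on question 1 below, one could compare the character
coverage of a custom training_text with the official chi_sim one from
langdata_lstm. This is only a minimal sketch, assuming both files are UTF-8
plain text. The function names and the min_count threshold are hypothetical
choices for illustration and are not part of the Tesseract tooling.

```python
from collections import Counter


def char_counts(text):
    """Count every non-whitespace character in a training text."""
    return Counter(ch for ch in text if not ch.isspace())


def coverage_report(own_text, reference_text, min_count=5):
    """Compare the character coverage of own_text with reference_text.

    Returns (missing, rare):
      missing - characters in reference_text that never occur in own_text
      rare    - characters that occur in own_text fewer than min_count times
    min_count is an arbitrary illustrative threshold, not a Tesseract setting.
    """
    own = char_counts(own_text)
    ref = char_counts(reference_text)
    missing = sorted(ch for ch in ref if ch not in own)
    rare = sorted(ch for ch, n in own.items() if n < min_count)
    return missing, rare


def compare_files(own_path, reference_path, min_count=5):
    """Same comparison, reading both texts from UTF-8 files (paths are yours)."""
    with open(own_path, encoding="utf-8") as f:
        own_text = f.read()
    with open(reference_path, encoding="utf-8") as f:
        reference_text = f.read()
    return coverage_report(own_text, reference_text, min_count)
```

Calling compare_files on your own training_text and the official one lists
the characters the official text has that yours lacks, and the characters
your text contains only a handful of times.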

On Tue, 30 Oct 2018, 02:32 bruce, <[email protected]> wrote:

> Thank you for your reply, shree.
> I've seen the training_text and the list of fonts.
> I will try again.
> Before I start my next from-scratch training, I have a few questions:
>
> 1. Does a training_text containing more characters give better training
> results? Is there an upper limit?
>
> 2. Does using more fonts give better training results?
>
> 3. I see that the official text contains not only Chinese characters, but
> also English characters and numbers.
>    If I run a command like this:
>    tesseract.exe test.png c:\dir\test -l eng+chi_sim
>    would it be better to train with a training_text of pure Chinese
> characters?
>
>
> On Tuesday, 30 October 2018 at 02:43:05 UTC+8, shree wrote:
>>
>> https://github.com/tesseract-ocr/langdata_lstm/tree/master/chi_sim
>>
>> On Mon, 29 Oct 2018, 14:41 Shree Devi Kumar, <[email protected]> wrote:
>>
>>> Please look at the langdata_lstm repo, specifically the chi_sim folder.
>>> It has the training_text as well as list of fonts used for LSTM training.
>>>
>>> On Mon, 29 Oct 2018, 05:40 bruce, <[email protected]> wrote:
>>>
>>>> Recently I have been using Tesseract to train the chi_sim language. I
>>>> want to train a chi_sim.traineddata that is better than the official one.
>>>> I generated training data of 82915 characters and trained on it with 7
>>>> common fonts.
>>>> After 4434207 iterations the training error rate is lower than 0.016%,
>>>> but the recognition results are much worse than with the official
>>>> traineddata.
>>>>
>>>> So I'm confused...
>>>>
>>>> How can I improve the quality of the training?
>>>> Do I need more training data and more training fonts? What is the right
>>>> amount?
>>>> I would like to know what training data and which fonts were used for
>>>> the official traineddata.
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/a7acc320-67f6-42b3-b2c8-99d3db6de7e6%40googlegroups.com
>>>> <https://groups.google.com/d/msgid/tesseract-ocr/a7acc320-67f6-42b3-b2c8-99d3db6de7e6%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
