Did you run lstmtraining with the --stop_training flag at the end, to convert the final checkpoint into a traineddata file?
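
For reference, here is a minimal sketch of what that final call could look like, written as a small Python helper that only builds and prints the command line (it does not run Tesseract). The flags --stop_training, --continue_from, --traineddata and --model_output are lstmtraining options; the three paths are hypothetical placeholders, so substitute your own checkpoint and starter traineddata.

```python
import shlex


def stop_training_command(checkpoint, starter_traineddata, output_traineddata):
    """Build the lstmtraining call that turns a checkpoint into a traineddata file."""
    return [
        "lstmtraining",
        "--stop_training",                      # convert instead of training further
        "--continue_from", checkpoint,          # checkpoint written during training
        "--traineddata", starter_traineddata,   # starter traineddata used for training
        "--model_output", output_traineddata,   # final traineddata file to write
    ]


# Hypothetical paths, for illustration only.
cmd = stop_training_command(
    "output/base_checkpoint",
    "train/eng/eng.traineddata",
    "output/eng.traineddata",
)

# Print a shell-safe version of the command so it can be copied and run by hand.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The helper returns a plain argument list, so it could also be handed to subprocess.run once the paths are real.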

ShreeDevi
____________________________________________________________
Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com

On Mon, Jan 8, 2018 at 5:51 PM, <[email protected]> wrote:

> Hi all,
>
> I am doing my project using Tesseract v4.00, and the traineddata output
> is always the same size after training with my own data.
> I suspect that I did not do the steps correctly.
>
> The only data that I provided were:
> 1. training_text
> 2. puncs (I just reduced the general punc file provided in the tesseract GitHub repo)
> 3. numbers
> 4. wordlists (I made various wordlists for several training runs, ranging
> between 100,000 and 2,000,000)
> 5. font names (I also used various fonts for several training runs, ranging
> between 1 and 20 fonts)
>
> The steps that I did were:
> 1. Made tiff files, unicharset and other supporting data using tesstrain.sh
> 2. Made tiff files, unicharset and other supporting data using tesstrain.sh
> for evaluation
> 3. Combined unicharset, wordlists, puncs, numbers and version_str to
> create the starter traineddata using combine_lang_model (I am still not
> confident about the value of version_str, though)
> 4. Trained the data using lstmtraining
> 5. Combined all output files using lstmtraining --continue_from ...
>
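
Step 3 above can be sketched roughly as follows. This is only an illustration, assuming the tool in question is combine_lang_model from the Tesseract training tools; the helper just builds the command line and does not run anything. All file and directory names are hypothetical, and the version_str value is a made-up label (as far as I know it is a free-form string, but please verify that against the training documentation).

```python
import shlex


def combine_lang_model_command(lang, unicharset, script_dir, output_dir,
                               words, puncs, numbers, version_str):
    """Build a combine_lang_model call that creates a starter traineddata."""
    return [
        "combine_lang_model",
        "--input_unicharset", unicharset,
        "--script_dir", script_dir,      # langdata directory with script unicharsets
        "--output_dir", output_dir,
        "--lang", lang,
        "--words", words,                # wordlist file
        "--puncs", puncs,                # punctuation patterns file
        "--numbers", numbers,            # number patterns file
        "--version_str", version_str,    # assumed to be a free-form label
    ]


# Hypothetical inputs, for illustration only.
cmd = combine_lang_model_command(
    lang="eng",
    unicharset="train/eng/eng.unicharset",
    script_dir="langdata",
    output_dir="starter",
    words="langdata/eng/eng.wordlist",
    puncs="langdata/eng/eng.punc",
    numbers="langdata/eng/eng.numbers",
    version_str="my-eng-model-v1",
)

print(" ".join(shlex.quote(arg) for arg in cmd))
```
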
> Yet all of my training runs ended with the same size, which is 10.5 MB.
> Did I do all my steps correctly?
>
> Once, I also trained after changing WORD_DAWG_FACTOR in
> language-specific.sh to 0 and 1, because I want the recognized text to match
> my wordlists 100%. But the result still did not satisfy me: some output words
> are not in my wordlists, such as "USISUSISU".
> Do you know what the cause is?
>
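
I cannot say whether WORD_DAWG_FACTOR alone is meant to restrict the output to the wordlist. As a separate check, a small post-processing step can at least list which recognized words fall outside it. The sketch below is plain Python and does not touch Tesseract; the wordlist and OCR text are made-up examples, and splitting on whitespace is a simplifying assumption.

```python
def words_not_in_wordlist(ocr_text, wordlist):
    """Return the recognized tokens that do not appear in the wordlist."""
    # Normalize the wordlist into a set for fast exact-match lookups.
    allowed = {word.strip() for word in wordlist if word.strip()}
    # Assumption: tokens are separated by whitespace and carry no punctuation.
    return [token for token in ocr_text.split() if token not in allowed]


# Hypothetical data, for illustration only.
wordlist = ["USIS", "SUSI"]
ocr_text = "USIS USISUSISU SUSI"

print(words_not_in_wordlist(ocr_text, wordlist))  # prints ['USISUSISU']
```
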
> I would really appreciate it if anyone can help or suggest a solution.
> Thank you!
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/b6ca74b2-1e50-44cb-93f6-586fcd26cec5%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

