To make sure that the model is not overfitted to the training data, your
eval set should be different from your training set.

You can use a different text file and different fonts from those in the
training set, to check that the model performs well on text and fonts it
has not seen before.
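
One way to build such a held-out set is to split a single list of .lstmf
files by font name. The following is only a rough sketch: it assumes the
files follow the <lang>.<font>.exp<N>.lstmf naming that tesstrain.sh
produces, and the function name, list file names and font names (FontD,
FontE, FontF) are hypothetical placeholders, not anything from your setup.

```shell
#!/usr/bin/env bash
# Sketch: split one list of .lstmf paths into a training list and an
# evaluation list, holding out every file rendered in the given fonts.
# Assumes names like kor.<font>.exp0.lstmf; split_by_font is a
# hypothetical helper, not part of Tesseract.
split_by_font() {
  local all_list=$1    # file with one .lstmf path per line
  local eval_fonts=$2  # held-out fonts, separated by '|', e.g. 'FontD|FontE'
  local train_out=$3   # output: list for --train_listfile
  local eval_out=$4    # output: list for --eval_listfile
  local pattern="\.(${eval_fonts})\.exp[0-9]+\.lstmf$"

  # Lines matching a held-out font go to the eval list, the rest to train.
  # '|| true' keeps the function from failing when one side is empty.
  grep -E  "$pattern" "$all_list" > "$eval_out"  || true
  grep -Ev "$pattern" "$all_list" > "$train_out" || true
}

# Example call (hypothetical file names):
#   split_by_font all_files.txt 'FontD|FontE|FontF' train_files.txt eval_files.txt
```

The two output files can then be passed to --train_listfile and
--eval_listfile respectively, so that no font appears in both lists.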

On Tue 10 Apr, 2018, 8:16 PM Fanatico, <fanatico.s...@gmail.com> wrote:

> Platform: MAC OS X
> Tesseract: 4.0.0-beta.1-69-g10f4
>
> When I execute a command like:
>
> SCROLLVIEW_PATH=~/projects/tesseract/java \
>   ~/projects/tesseract/training/lstmtraining \
>     --debug_interval 100 \
>     --continue_from ~/projects/ocr/training/kortrain/kor_from_full/kor.lstm \
>     --traineddata ~/projects/ocr/training/kortrain/new_train/kor/kor.traineddata \
>     --append_index 5 \
>     --net_spec '[Lfx256 O1c111]' \
>     --model_output ~/projects/ocr/training/kortrain/kor_from_full/base \
>     --train_listfile ~/projects/ocr/training/kortrain/new_train/kor.training_files.txt \
>     --eval_listfile ~/projects/ocr/training/kortrain/eval/kor.training_files.txt \
>     --target_error_rate 1 &>~/projects/ocr/training/kortrain/kor_from_full/basetrain.log
>
> I have "--train_listfile", which gives the location of my training files
> for each font, and "--eval_listfile", which I suppose gives the location
> of the training files used to test the result of the training.
>
> So my questions are:
> 1 - Why am I training with the fonts "A", "B" and "C" but testing with the
> fonts "D", "E" and "F"?
> 2 - And if I need to test using the same fonts, then why do I need to pass
> the same file twice?
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/532b2514-ff7d-4c2c-998a-d61a2aee653a%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/532b2514-ff7d-4c2c-998a-d61a2aee653a%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

