You cannot use an arbitrary unicharset; it needs to be the same one that was
used to train the model.
For multiple exposures, use the following command:
training/tesstrain.sh \
    --fonts_dir /mnt/c/Windows/Fonts \
    --lang eng \
    --noextract_font_properties --linedata_only \
    --exposures "-1, 0, 1" \
    --langdata_dir ../langdata \
    --tessdata_dir ../tessdata \
    --fontlist \
        "Arial" \
        "Tahoma" \
        "Times New Roman," \
        "Sanskrit 2003," \
        "FreeSerif Italic" \
        "Times New Roman, Italic" \
    --output_dir ../tesstutorial/eng
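
On the naming question: as far as I know, tesstrain.sh names the generated tiff/box pairs itself, one per font and exposure, so you should not need to rename anything by hand. The sketch below only illustrates the pattern I would expect (lang.Fontname.expN, where N is the exposure value itself). It assumes spaces in the font name become underscores and commas are dropped; training_basename is a hypothetical helper written for this illustration, not part of Tesseract, so please check the result against the files the script actually writes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Tesseract): prints the base name assumed
# for one language/font/exposure combination.
# Assumption: spaces in the font name become underscores, commas are dropped.
training_basename() {
    local lang="$1" font="$2" exposure="$3"
    local fontname
    fontname=$(printf '%s' "$font" | tr ' ' '_' | sed 's/,//g')
    printf '%s.%s.exp%s\n' "$lang" "$fontname" "$exposure"
}

# List the assumed base names for a few of the fonts and all three exposures.
for font in "Arial" "Times New Roman," "Times New Roman, Italic"; do
    for exposure in -1 0 1; do
        training_basename eng "$font" "$exposure"
    done
done
```

Under these assumptions the exposure value, not a running index, goes after "exp", so the three Arial pairs would be eng.Arial.exp-1, eng.Arial.exp0 and eng.Arial.exp1.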
ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
On Sat, Sep 23, 2017 at 8:46 PM, Dan9er <[email protected]> wrote:
> I'm making a unicharset file so I can compile DAWG dictionary files so I
> can use it with tesstrain.sh. I want to use multiple exposures (-1, 0, 1)
> for the tiff/box pairs. How should I name them to separate the
> different exposures?
>
> Can I do this?:
>
> lang.Arial.exp0
> lang.Arial.exp1
> lang.Arial.exp2
>
> Or will changing the file numbers screw things up? As an alternative, can
> I do this?:
>
> lang.Arial0.exp0
> lang.Arial1.exp0
> lang.Arial2.exp0
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/6e9f4a45-5dde-41f6-8a41-a403778aef54%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>