Re: [tesseract-ocr] Creation of encoded unicharset failed While constructing LSTM training data.

2017-08-10 Thread ShreeDevi Kumar
Seems to work fine for me.

Are you sure that the relevant files are present in the directories listed
in that command?

Check the tessdata and langdata locations.

Use tessdata/best/*.traineddata as the existing models.
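
The check suggested above can be scripted. The following is only a minimal sketch: the function name check_training_inputs is hypothetical, and the three file names it looks for (the language's training text, radical-stroke.txt, and the existing traineddata model) are assumptions about a typical langdata/tessdata checkout, not a list confirmed in this thread. Adjust them to match your setup.

```shell
#!/bin/sh
# Hypothetical helper: report which expected training inputs are missing.
# The file names below are ASSUMPTIONS about a typical langdata/tessdata
# layout; they are not taken from the thread.
check_training_inputs() {
    langdata_dir=$1   # e.g. ../langdata
    tessdata_dir=$2   # e.g. ./tessdata
    lang=$3           # e.g. eng
    missing=0
    for f in \
        "$langdata_dir/$lang/$lang.training_text" \
        "$langdata_dir/radical-stroke.txt" \
        "$tessdata_dir/$lang.traineddata"
    do
        if [ -f "$f" ]; then
            echo "found:   $f"
        else
            echo "MISSING: $f"
            missing=$((missing + 1))
        fi
    done
    # Exit status 0 only when every expected file was found.
    [ "$missing" -eq 0 ]
}

# Example call with the paths from the command quoted below:
# check_training_inputs ../langdata ./tessdata eng
```

If the function reports a missing file, fix the corresponding --langdata_dir or --tessdata_dir argument (or fetch the file) before rerunning tesstrain.sh.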

ShreeDevi

Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com

On Thu, Aug 10, 2017 at 2:05 PM, robertyoung0511 wrote:

> Hello,
>
> I'm trying to fine-tune the eng.traineddata model by following the steps at:
> https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#fine-tuning-for-%C2%B1-a-few-characters
>
> As the tutorial describes, I am fine-tuning for ± a few characters.
>
> But when I execute the first command, to generate new training and eval
> data:
>
> training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
>   --noextract_font_properties --langdata_dir ../langdata \
>   --tessdata_dir ./tessdata --output_dir ~/tesstutorial/trainplusminus
>
>
> An error is reported: *Creation of encoded unicharset failed!* while
> constructing LSTM training data.
>
> For more details, see the attached image.
>
> Can you help me? Thanks.
>
>
>
> 
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/1c40ba47-a6e5-4ec9-bf58-677bcdb2f74b%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>



[tesseract-ocr] Creation of encoded unicharset failed While constructing LSTM training data.

2017-08-10 Thread robertyoung0511
Hello,

I'm trying to fine-tune the eng.traineddata model by following the steps at:
https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#fine-tuning-for-%C2%B1-a-few-characters

As the tutorial describes, I am fine-tuning for ± a few characters.

But when I execute the first command, to generate new training and eval 
data:

training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
  --noextract_font_properties --langdata_dir ../langdata \
  --tessdata_dir ./tessdata --output_dir ~/tesstutorial/trainplusminus


An error is reported: *Creation of encoded unicharset failed!* while
constructing LSTM training data.

For more details, see the attached image.

Can you help me? Thanks.



