That error occurs because some characters in your training text are not part
of the unicharset of chi_sim.

You are attempting fine-tuning, which fails with this error when the training
text contains characters outside the existing unicharset. Replacing the top
layer will work.

I suggest that you wait 2-3 weeks for Ray to upload new traineddata for all
languages.

You can tell us if any specific characters are missing from the existing
traineddata.
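
To find which characters are missing, you can compare your training text
against the unicharset. The following is only a rough sketch in Python, not
part of Tesseract. It assumes you have a unicharset file in the usual text
layout (first line is the entry count, each later line starts with the symbol
followed by a space and its properties), for example one unpacked from
chi_sim.traineddata with combine_tessdata -u. The function names and the file
names in the usage comments are made up for illustration. It only checks
single characters and does not apply Tesseract's own text normalization, so
treat the result as a hint.

```python
def load_unicharset(lines):
    """Collect the symbols listed in a unicharset file.

    Assumption: line 1 holds the entry count, and every later line begins
    with the symbol, then a space, then its properties.
    """
    symbols = set()
    for line in lines[1:]:
        line = line.rstrip("\n")
        if not line:
            continue
        # The symbol is everything before the first space on the line.
        symbols.add(line.split(" ", 1)[0])
    return symbols


def missing_characters(text, symbols):
    """Return the non-whitespace characters of text with no unicharset entry."""
    return sorted({ch for ch in text if not ch.isspace() and ch not in symbols})


# Example usage (hypothetical file names):
# with open("chi_sim.lstm-unicharset", encoding="utf-8") as f:
#     symbols = load_unicharset(f.readlines())
# with open("chi.training_text", encoding="utf-8") as f:
#     print("".join(missing_characters(f.read(), symbols)))
```

Any characters it prints are candidates to report here.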

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Tue, Jul 25, 2017 at 12:46 PM, <[email protected]> wrote:

> Hello,
>
> I ran the following command to train my own traineddata:
>
> lstmtraining --model_output ~/tesstutorial/chituned_from_chisim/chituned \
>   --continue_from ~/tesstutorial/chituned_from_chisim/chi_sim.lstm \
>   --train_listfile ~/tesstutorial/chitest/chi.training_files.txt \
>   --eval_listfile ~/tesstutorial/chitest/chi.training_files.txt \
>   --target_error_rate 0.01
>
> Tesseract 4.0 reports the error shown in the image below. It says
> "Can't encode transcript" for text content such as
> "化简(-x2)3的结果是...".
> Why does this happen? Can you help me?
>
>
> <https://lh3.googleusercontent.com/-f5tjdv3_nvk/WXbvefZQYrI/AAAAAAAAAAM/COSWa-ewxy46XNkFxUCUl5V2r4K2ZfiQACLcBGAs/s1600/_%2524_WUP8_FXB%2560DR9_I5A8Y%2560L.png>
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/e2e1d749-a55d-4355-b128-5d0fe2181e19%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

