Re: [tesseract-ocr] "Can't encode transcript" error when using "lstmtraining" command with Tess4.0

2017-08-04 Thread robertyoung0511
Hi Shree, I have also tried the new traineddata for recognizing Simplified Chinese on Linux (Ubuntu), and it works. However, the new traineddata does not seem to be supported on Windows. With the new traineddata on Ubuntu, there are also some special symbols that cannot be

Re: [tesseract-ocr] "Can't encode transcript" error when using "lstmtraining" command with Tess4.0

2017-08-04 Thread robertyoung0511
I have tried the new traineddata on Linux (Ubuntu). It works, but the new traineddata does not seem to be supported on Windows. On Tuesday, August 1, 2017 at 6:03:13 PM UTC+8, roberty...@gmail.com wrote: > > When I use the new traineddata, it will report an >

Re: [tesseract-ocr] "Can't encode transcript" error when using "lstmtraining" command with Tess4.0

2017-08-01 Thread robertyoung0511
When I use the new traineddata, it reports an error: cannot find the chi_sim.traineddata. Does the new traineddata only support the Tess4.0 alpha release? I am using the newest code release. On Tuesday, August 1, 2017 at 4:45:07 PM UTC+8, shree wrote: > > Ray has uploaded new traineddata files in >

Re: [tesseract-ocr] "Can't encode transcript" error when using "lstmtraining" command with Tess4.0

2017-08-01 Thread robertyoung0511
OK, I will give it a try. Thanks. On Tuesday, August 1, 2017 at 4:45:07 PM UTC+8, shree wrote: > > Ray has uploaded new traineddata files in > https://github.com/tesseract-ocr/tessdata/tree/master/best > > Why don't you first try recognition with that > > ShreeDevi >

Re: [tesseract-ocr] "Can't encode transcript" error when using "lstmtraining" command with Tess4.0

2017-08-01 Thread ShreeDevi Kumar
Ray has uploaded new traineddata files in https://github.com/tesseract-ocr/tessdata/tree/master/best. Why don't you first try recognition with that? ShreeDevi भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com On Tue, Aug 1, 2017 at

Re: [tesseract-ocr] "Can't encode transcript" error when using "lstmtraining" command with Tess4.0

2017-08-01 Thread robertyoung0511
Hello Shree, sorry to ask again, but can I use more than one unicharset, such as chi_sim and eng and so on, for fine-tuning? Some of the special characters may be in other unicharsets. If I find them, I may train my traineddata with more unicharsets, and the special

Re: [tesseract-ocr] "Can't encode transcript" error when using "lstmtraining" command with Tess4.0

2017-07-25 Thread robertyoung0511
Thanks for the help. I will fine-tune with the new traineddata for all languages in 2-3 weeks, and give feedback on how the specific characters perform. On Tuesday, July 25, 2017 at 3:23:08 PM UTC+8, shree wrote: > > That error is because some characters in your training text are not part > of the unicharset of

Re: [tesseract-ocr] "Can't encode transcript" error when using "lstmtraining" command with Tess4.0

2017-07-25 Thread ShreeDevi Kumar
That error occurs because some characters in your training text are not part of the unicharset of chi_sim. You are attempting fine-tuning, which will give this error. Replacing the top layer will work. I suggest that you wait 2-3 weeks for Ray to upload new traineddata for all languages. You can tell us if
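The cause described in this reply (characters in the training text that are missing from the unicharset) can be checked before running lstmtraining. The following is a minimal Python sketch, not part of the original thread. It assumes the plain-text unicharset layout in which the first line is an entry count and each following line starts with the unichar, followed by a space and its properties, with `NULL` standing in for the space character. The function names and the sample lines are hypothetical, and the property fields in the sample are made up for illustration.

```python
def read_unicharset(lines):
    """Collect the unichars from unicharset-style lines.

    Assumed layout: the first line is an entry count; every later line
    begins with the unichar, then a space, then its properties.
    """
    it = iter(lines)
    next(it, None)  # skip the count line
    chars = set()
    for line in it:
        token = line.rstrip("\n").split(" ", 1)[0]
        if not token:
            continue  # blank or malformed line
        # Assumption: "NULL" is the placeholder entry for a space.
        chars.add(" " if token == "NULL" else token)
    return chars


def unencodable_chars(text, charset):
    """Return the non-whitespace characters of text absent from charset.

    This compares single code points only, so it is approximate when the
    unicharset contains multi-code-point entries.
    """
    return sorted({ch for ch in text if not ch.isspace() and ch not in charset})


# Hypothetical in-memory sample; a real file would be opened with
# open(path, encoding="utf-8") and passed in the same way.
sample_unicharset = [
    "3\n",
    "NULL 0 Common 0\n",
    "中 5 Han 1\n",
    "文 5 Han 2\n",
]

charset = read_unicharset(sample_unicharset)
missing = unencodable_chars("中文字", charset)
print(missing)
```

Any character reported here would be a candidate for the "Can't encode transcript" failure under fine-tuning, since the existing unicharset has no entry for it.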

[tesseract-ocr] "Can't encode transcript" error when using "lstmtraining" command with Tess4.0

2017-07-25 Thread robertyoung0511
Hello, I used the following command to train my own traineddata:
lstmtraining --model_output ~/tesstutorial/chituned_from_chisim/chituned \
  --continue_from ~/tesstutorial/chituned_from_chisim/chi_sim.lstm \
  --train_listfile ~/tesstutorial/chitest/chi.training_files.txt \
  --eval_listfile