Re: [tesseract-ocr] How to train by tesseract 4.00

2018-06-03 Thread ShreeDevi Kumar
If you want to train using fonts, use tesstrain.sh. See the wiki pages
on training.

If you want to train from scanned images, see
https://github.com/OCR-D/ocrd-train, which uses line images and their
ground truth transcriptions to create box files and lstmf files and then
run the training.
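
As a rough illustration of a line-level box file, here is a minimal Python
sketch. It assumes the convention in which every character of the
transcription repeats the bounding box of the whole line image and a tab
entry marks the end of the line; please check the ocrd-train repository for
the exact format it expects. The function name line_box_entries is
hypothetical, and the sketch iterates per code point, so it does not group
combining marks with their base character.

```python
from typing import List


def line_box_entries(text: str, width: int, height: int) -> List[str]:
    """Return box-file lines for one text-line image of width x height pixels.

    Assumed convention: each character carries the box of the whole line,
    and a final tab entry marks the end of the text line.
    """
    # One entry per character, all sharing the full-line bounding box.
    entries = [f"{ch} 0 0 {width} {height} 0" for ch in text]
    # End-of-line marker: a tab character with a box just past the line.
    entries.append(f"\t {width} {height} {width + 1} {height + 1} 0")
    return entries


# Example with a 256x48 line image, as in the question quoted below.
for entry in line_box_entries("若存在", 256, 48):
    print(entry)
```

The resulting lines would be written to a .box file next to the line image
before the lstmf files are generated.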

ShreeDevi

भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Sun, Jun 3, 2018 at 3:59 PM,  wrote:

> I have read that in version 4.00 a box only needs to cover a text line
> instead of individual characters.
>
> So I made a box file like this:
>
> 若存在,试求出实数λ的值; 0 0 256 48 0
>
> How do I train with it?
>
> Is it the same as in version 3 (tesseract chi_my.font.exp0.tif
> chi_my.font.exp0 nobatch box.train)?
>
> Or is there a better method?
>
> Thanks!

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduXr6VgxG4CmS75crmTZ%2BYHW%3DKQTwvcAV0ixRsRd3h7zkg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


[tesseract-ocr] Unicharset_extractor meet ICU ERROR

2018-06-03 Thread yang3781590
Environment

   - Tesseract Version: 4.00
   - Platform:

Current Behavior:

C:\Users\Jerry\Desktop\新建文件夹>unicharset_extractor chi_my.font.exp0.box
Extracting unicharset from box file chi_my.font.exp0.box
ICU ERROR: U_FILE_ACCESS_ERROR


However, I found that the error does not occur when I use
[tesseract-ocr-setup-4.00.00dev.exe]. It occurs when I use
[tesseract-ocr-w64-setup-v4.0.0-beta.1.20180414.exe].



[tesseract-ocr] How to train by tesseract 4.00

2018-06-03 Thread yang3781590
I have read that in version 4.00 a box only needs to cover a text line
instead of individual characters.

So I made a box file like this:

若存在,试求出实数λ的值; 0 0 256 48 0

How do I train with it?

Is it the same as in version 3 (tesseract chi_my.font.exp0.tif
chi_my.font.exp0 nobatch box.train)?

Or is there a better method?

Thanks! 



Re: [tesseract-ocr] error in lstm training

2018-06-03 Thread nick
Hi Shree,

Thanks for your reply. I will check it as soon as possible.


On Saturday, June 2, 2018 at 3:56:39 PM UTC+4:30, shree wrote:
>
> > !int_mode_:Error:Assert failed:in file weightmatrix.cpp, line 244 
>
> You can only use continue_from with the models in the tessdata_best repo,
> which are float models. The integer models in tessdata and tessdata_fast
> cannot be used for that purpose.
>
> ShreeDevi
> 
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
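
For reference, the fine-tuning flow described in the quoted reply can be
sketched in Python as below. This is only a sketch under assumptions: it
builds the two command lines (extracting the LSTM model from a tessdata_best
traineddata file with combine_tessdata -e, then passing it to lstmtraining
via --continue_from) without running them. The function name, the paths, the
"finetuned" output prefix and the iteration count are all hypothetical
placeholders.

```python
from pathlib import Path
from typing import List


def continue_from_commands(best_traineddata: Path, work_dir: Path,
                           list_file: Path,
                           max_iterations: int = 400) -> List[List[str]]:
    """Build the commands for fine-tuning from a float (tessdata_best) model."""
    # The extracted LSTM component, e.g. work_dir/eng.lstm for eng.traineddata.
    lstm_model = work_dir / f"{best_traineddata.stem}.lstm"
    # Step 1: extract the LSTM model from the tessdata_best traineddata file.
    extract = ["combine_tessdata", "-e", str(best_traineddata), str(lstm_model)]
    # Step 2: continue training from the extracted float model.
    train = [
        "lstmtraining",
        "--continue_from", str(lstm_model),
        "--traineddata", str(best_traineddata),
        "--train_listfile", str(list_file),
        "--model_output", str(work_dir / "finetuned"),
        "--max_iterations", str(max_iterations),
    ]
    return [extract, train]


# Hypothetical paths; the traineddata must come from tessdata_best.
for cmd in continue_from_commands(Path("tessdata_best/eng.traineddata"),
                                  Path("output"),
                                  Path("train/eng.training_files.txt")):
    print(" ".join(cmd))
```

Each list could be passed to subprocess.run once the paths point at real
files. Using a traineddata file from tessdata or tessdata_fast in step 1
would lead to the assertion failure quoted above.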
