Re: [tesseract-ocr] Re: Training Tesseract 5.0.0 to recognize digital handwriting

'Fabio Lugli' via tesseract-ocr Thu, 16 Jan 2020 04:14:50 -0800

I still get the error, but I understand that it comes from how I write the 
*all-lstmf* file, from which lstmtraining can't get the images. Right now I 
write this into it:


*[FULL PATH TO MY FILE]/eng.test.pro0.lstmf*
*[FULL PATH TO MY FILE]/eng.test.pro1.lstmf*
*[FULL PATH TO MY FILE]/eng.test.pro2.lstmf*
etc.

Am I correct in saying that this is not what I should have inside *all-lstmf*? 
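
One way to narrow this down is to check that every entry in the list file actually points to a readable file. Below is a minimal sketch, assuming the list file holds one .lstmf path per line (as the `ls -1 *.lstmf > all-lstmf` command in the quoted reply suggests). The function name `check_listfile` is hypothetical and not part of Tesseract.

```shell
# Hypothetical helper (not a Tesseract tool): report list-file entries
# that do not point to a readable file.
check_listfile() {
    listfile="$1"
    missing=0
    # Read one path per line; also handle a last line without a newline.
    while IFS= read -r path || [ -n "$path" ]; do
        # Skip blank lines.
        [ -z "$path" ] && continue
        if [ ! -r "$path" ]; then
            echo "missing or unreadable: $path"
            missing=$((missing + 1))
        fi
    done < "$listfile"
    echo "unreadable entries: $missing"
    # Succeed only if every listed file is readable.
    [ "$missing" -eq 0 ]
}
```

Running `check_listfile all-lstmf` before lstmtraining would show whether a path in the list is mistyped or points somewhere else. It only tests that the files exist and are readable, not that their contents are valid training data.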

On Thursday, 16 January 2020 12:04:50 UTC+1, shree wrote:
>
> tesseract unpack is a new feature by @stweil - not yet in the master 
> branch. I was testing to see that your lstmf files are read correctly and 
> they are.
>
> For tesstrain, all you need are single line images and their gt.txt.
>
> I ran lstmtraining using your lstmf files, which worked fine. 
>
> If you want to test, try the following in a directory where you have the 
> two sample lstmf files.
> Change  ~/tessdata_best to wherever you have the best traineddata file.
>
> ls -1 *.lstmf > all-lstmf
> mkdir -p ./testdir
> combine_tessdata -e ~/tessdata_best/eng.traineddata   ./testdir/eng.lstm
>
> time lstmtraining \
>    --debug_interval  -1 \
>    --model_output ./testdir/impact \
>    --continue_from ./testdir/eng.lstm \
>    --train_listfile all-lstmf \
>    --traineddata ~/tessdata_best/eng.traineddata \
>    --max_iterations 400
>

