Hey everyone,
I am training my own LSTM model using some specific images that I want
Tesseract to work efficiently on. I have used the command
*$ lstmtraining --model_output=my_output.lstm --traineddata="C:\Program
Files\Tesseract-OCR\tessdata\eng.traineddata" --old_traineddata="C:\Program
I tried to train Tesseract 5 with a new font in Thai, but the BCER value
keeps increasing. Here are the details:
Font : TH Sarabun New (200 samples)
Base Model: tha.traineddata (I downloaded it from tessdata_best)
(base) Unknown tesstrain % TESSDATA_PREFIX=../tesseract/tessdata
I've trained on thousands of images, but the traineddata file size didn't
change at all.
Did I do something wrong?
Windows, MSVC 2022, win32. I've got some questions regarding compilation:
1) How do I specify the directory where Leptonica is installed? No matter
what I tried, the sln file contains *c:\Program Files(x86)\Leptonica* every time.
2) Leptonica is definitely compiled with libtiff support:
*-- Used TIFF
One correction:
I checked the example at the URL mentioned below with the Tesseract
executable and the tessdata repository. The result is that user_patterns
also affects LSTM. This can easily be tested by generating output
without user_patterns (Arial.txt):
tesseract Arial.png Arial
And with
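To see whether a second run actually changed the recognized text, the two output files can be compared line by line. The following is only a minimal Python sketch of that comparison: the function name `changed_lines` is made up for illustration, and it assumes both runs have already written plain-text output files. It does not run Tesseract itself.

```python
import difflib


def changed_lines(without_patterns: str, with_patterns: str) -> list[str]:
    """Return only the lines that differ between two OCR text outputs.

    Lines prefixed with "- " appear only in the first text,
    lines prefixed with "+ " appear only in the second text.
    """
    diff = difflib.ndiff(without_patterns.splitlines(),
                         with_patterns.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]
```

For example, after reading `Arial.txt` and the output of the second run (say, a hypothetically named `Arial_patterns.txt`) with `pathlib.Path(...).read_text()`, an empty result would suggest that the patterns had no effect on that image, while any "- "/"+ " pairs show exactly which lines were recognized differently.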