Hi,
I keep having problems with duplicated letters with custom fine-tuned
models.
For example, an M becomes MH.
I'm using ocrd-train with actual crops and I noticed that the lstmf files
are generated with psm 6.
At runtime I use psm 7. Do you think this may make a difference? From a
quick test
A checkpoint is NOT a traineddata file.
Use --stop_training to build the traineddata file from the checkpoint.
E.g.:
echo " stop training "
~/tesseract/bin/src/training/lstmtraining \
--stop_training \
--continue_from ./devaplus_z1/plus_checkpoint \
--traineddata
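For reference, here is a minimal sketch of the full conversion step. It assumes that lstmtraining is on your PATH, and the starter traineddata and output paths are made-up placeholders, not values from this thread. As far as I know, --traineddata points at the starter traineddata used for training and --model_output names the file to write.

```shell
#!/bin/sh
# Sketch only: every path except the checkpoint is a hypothetical placeholder.
CHECKPOINT=./devaplus_z1/plus_checkpoint   # checkpoint written during training
STARTER=./starter/jen.traineddata          # assumption: starter traineddata used for training
MODEL_OUTPUT=./jen.traineddata             # assumption: where the final model should go

if command -v lstmtraining >/dev/null 2>&1 && [ -f "$CHECKPOINT" ]; then
  # Convert the checkpoint into a traineddata file that tesseract can load.
  lstmtraining \
    --stop_training \
    --continue_from "$CHECKPOINT" \
    --traineddata "$STARTER" \
    --model_output "$MODEL_OUTPUT"
else
  # Nothing is converted if the tool or the checkpoint is missing.
  echo "lstmtraining or $CHECKPOINT not found; nothing converted" >&2
fi
```

The file written to --model_output is the one to copy into the tessdata folder, not the checkpoint itself.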
The parameter can be left out of the command. It does not appear to change
the result.
Hey everyone,
after training my model and reaching sufficient accuracy, I copied the
checkpoint into the TESSDATA_PREFIX folder and renamed it to
jen.traineddata:
cp jens_checkpoint /usr/share/tesseract/4/tessdata/jen.traineddata
When trying to run tesseract with the new language I get the