[tesseract-ocr] Does the psm value used to generate lstmf files influence the training?

2019-03-21 Thread Lorenzo Bolzani
Hi, I keep having problems with duplicated letters in custom fine-tuned models. For example, an M becomes MH. I'm using ocrd-train with actual crops, and I noticed that the lstmf files are generated with psm 6. At runtime I use psm 7. Do you think this may make a difference? From a quick test
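One way to look into the question empirically is to run the same line crop through both page segmentation modes and compare the text. The following is only a sketch: the image name line.png and the model name mymodel are hypothetical placeholders, and the helper ocr_with_psm is not part of tesseract or ocrd-train. It uses the standard tesseract options --psm and -l; psm 6 assumes a single uniform block of text, psm 7 treats the image as a single text line.

```shell
#!/bin/sh
# Sketch: compare OCR output of one image under two page segmentation modes.
# line.png and mymodel are hypothetical placeholders.

DRY_RUN=1   # non-empty: only print the commands; set to empty to run tesseract

ocr_with_psm() {
    # $1 = image file, $2 = language/model name, $3 = page segmentation mode
    ${DRY_RUN:+echo} tesseract "$1" stdout --psm "$3" -l "$2"
}

# psm 6 is what the lstmf files were generated with, psm 7 is the runtime mode.
for psm in 6 7; do
    ocr_with_psm line.png mymodel "$psm"
done
```

With DRY_RUN left at 1 the script only prints the two command lines, which makes it easy to check them before running against real crops.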

Re: [tesseract-ocr] Error by using own model

2019-03-21 Thread Shree Devi Kumar
A checkpoint is NOT a traineddata file. Use --stop_training to build the traineddata, e.g.:

    echo " stop training "
    ~/tesseract/bin/src/training/lstmtraining \
      --stop_training \
      --continue_from ./devaplus_z1/plus_checkpoint \
      --traineddata
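For readers who want to see the whole shape of such a conversion, here is a hedged sketch. All three paths are hypothetical placeholders and are not taken from the message above, and convert_checkpoint is an illustrative helper, not a tesseract tool. The flags --stop_training, --continue_from, --traineddata and --model_output are lstmtraining options; --traineddata is assumed here to point at the starter traineddata used for training, and --model_output at the file to be created.

```shell
#!/bin/sh
# Sketch: turn an lstmtraining checkpoint into a usable .traineddata file.
# Every path below is a hypothetical placeholder.

DRY_RUN=1   # non-empty: only print the command; set to empty to run lstmtraining

convert_checkpoint() {
    # $1 = checkpoint written during training
    # $2 = starter traineddata the training was run with (assumption)
    # $3 = final traineddata file to create
    ${DRY_RUN:+echo} lstmtraining --stop_training \
        --continue_from "$1" \
        --traineddata "$2" \
        --model_output "$3"
}

convert_checkpoint ./output/mymodel_checkpoint \
    ./output/mymodel/mymodel.traineddata \
    ./output/mymodel.traineddata
```

The resulting file, rather than the renamed checkpoint, is what would then be copied into the tessdata directory.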

[tesseract-ocr] Re: Finetuning in ocrd-train

2019-03-21 Thread Jens Humrich
The parameter can be left out of the command. It does not appear to change the result.

[tesseract-ocr] Error by using own model

2019-03-21 Thread Jens Humrich
Hey everyone, after training my model and reaching sufficient accuracy, I copied the checkpoint into the TESSDATA_PREFIX folder and renamed it to jen.traineddata:

    cp jens_checkpoint /usr/share/tesseract/4/tessdata/jen.traineddata

When trying to run tesseract with the new language I get the