Hello Lorenzo,

We're fine-tuning en.traineddata without modifications, with the charset restricted to [A-Z0-9]. We're using the default parameters, and the model converges very fast. We have 1376 images from Google Images that we use to test the accuracy. The reported accuracy is min(detector, recognizer). These 1376 images can't be used directly with Tesseract; they require a detector and a preprocessor first.
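
To make the min(detector, recognizer) figure concrete, here is a minimal sketch. It assumes each stage's accuracy is simply correct/total over the test images. The function names and the counts in the usage lines are hypothetical, chosen only for illustration; they are not taken from the tesseractMRZ code or its results.

```python
def stage_accuracy(correct: int, total: int) -> float:
    """Fraction of test images a single stage handled correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    return correct / total


def reported_accuracy(detector_acc: float, recognizer_acc: float) -> float:
    """Report the weaker of the two stages, i.e. min(detector, recognizer)."""
    return min(detector_acc, recognizer_acc)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    detector = stage_accuracy(correct=9, total=10)
    recognizer = stage_accuracy(correct=8, total=10)
    print(reported_accuracy(detector, recognizer))
```

Taking the minimum reports the pipeline as no better than its weakest stage, which is a different figure from the product of the two stage accuracies.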
On Wednesday, May 29, 2019 at 10:08:53 AM UTC+2, Lorenzo Blz wrote:
>
> Hi Mamadou,
> this sounds very interesting. How did you do the training and accuracy
> measurements? What parameters did you use for the model?
>
> Thanks, bye
>
> Lorenzo
>
> On Mon, May 27, 2019 at 07:38, Mamadou <[email protected]> wrote:
>
>> Hello,
>>
>> We have open sourced (BSD license) an MRZ/MRP (Machine-readable
>> zone/passport) dataset and models for Tesseract v4.
>> The dataset contains more than 7 thousand images (.tif) with ground
>> truth (.gt.txt) from Google Images, augmented with a small amount of
>> synthetic data.
>> It's ready to be used for training with Tesseract v4.
>> If you're lazy and don't want to train the models yourself, try
>> the ones under the tessdata_best (float model) or tessdata_fast
>> (int model) folders.
>>
>> Accuracy: 99.7%
>> Source code: https://github.com/DoubangoTelecom/tesseractMRZ
>>
>> Regards,
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/a92ec47e-5055-4ffe-a174-f437d3c7ccf2%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.

