Re: [tesseract-ocr] Re: Dot Matrix Fonts and Tesseract's Connected Component Analysis

Shree Devi Kumar Sat, 23 Mar 2019 03:52:19 -0700

>
> That's interesting that you tried replacing the top layer.  I haven't
> tried that yet.  How many iterations did you use?
>
In this case the unicharset was limited to UPPERCASE letters, the digits 0-9,
: and /.
I used a training_text which followed the pattern of the image: lines
starting with LOT# and EXP:, laid out in a similar way.
I used 2 fonts which were very similar to the image.
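
To illustrate the kind of training_text described above, here is a minimal Python sketch that generates lines alternating between LOT# and EXP: prefixes. Only the two prefixes and the restricted character set come from this post; the 8-character lot code, the MM/YYYY date form, and the function name `make_training_lines` are assumptions made up for the example, so adjust them to match the real labels.

```python
import random
import string

# Character pool for the random part of each line: uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits


def make_training_lines(n_pairs, seed=0):
    """Return 2 * n_pairs lines alternating LOT# and EXP: fields."""
    rng = random.Random(seed)  # fixed seed so the text is reproducible
    lines = []
    for _ in range(n_pairs):
        # Hypothetical lot code: 8 random characters (length is an assumption).
        lot = "".join(rng.choice(ALPHABET) for _ in range(8))
        # Hypothetical expiry date in MM/YYYY form (format is an assumption).
        month = rng.randint(1, 12)
        year = rng.randint(2019, 2029)
        lines.append(f"LOT# {lot}")
        lines.append(f"EXP: {month:02d}/{year}")
    return lines


if __name__ == "__main__":
    # Print a small sample; redirect to a file to use it as training text.
    print("\n".join(make_training_lines(5)))
```

The output can be saved to a file and rendered with the chosen fonts by whatever training tooling is in use.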
So this was narrowly focussed on a single use, and only 2000 iterations were
needed with tessdata_best/eng to get the error rate down to 0.2 or so.
The number of iterations for plus training was similar, but it did not
give the same accuracy (also, the traineddata file size is much smaller using
this method).
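
For reference, a replace-top-layer run like the one described could be launched roughly as in the Python sketch below, which only assembles the `lstmtraining` argument list. The flags are real `lstmtraining` options, and 2000 iterations is the figure from this post. The `--append_index 5` and `--net_spec` values are copied from the example in the Tesseract training tutorial, not from this run, and all file paths and the helper name `build_lstmtraining_args` are hypothetical.

```python
import subprocess


def build_lstmtraining_args(base_lstm, traineddata, listfile, output_prefix,
                            max_iterations=2000,
                            append_index=5,
                            net_spec="[Lfx256 O1c111]"):
    """Assemble an lstmtraining command line for replacing the top layer(s).

    base_lstm is the LSTM model extracted from tessdata_best/eng.traineddata
    (for example with `combine_tessdata -e eng.traineddata eng.lstm`).
    append_index and net_spec are the tutorial's example values, i.e. an
    assumption here; the post does not state which values were used.
    """
    return [
        "lstmtraining",
        "--continue_from", base_lstm,
        "--traineddata", traineddata,      # starter traineddata with the reduced unicharset
        "--append_index", str(append_index),
        "--net_spec", net_spec,
        "--model_output", output_prefix,
        "--train_listfile", listfile,
        "--max_iterations", str(max_iterations),
    ]


if __name__ == "__main__":
    # All paths below are placeholders.
    args = build_lstmtraining_args(
        "eng.lstm", "lot/lot.traineddata", "lot.training_files.txt", "out/lot")
    print(" ".join(args))
    # To actually start training (requires Tesseract's training tools):
    # subprocess.run(args, check=True)
```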


