Hi Shree,

Thanks for the files! It's interesting that you tried replacing the top layer. I haven't tried that yet. How many iterations did you use?
I was thinking today that it is difficult to create a single strong learner with Tesseract because training from scratch requires so much data. With fine-tuning, however, it is easy to create many weak learners. Do you know of any successful uses of an ensemble model with Tesseract?

Thanks again,
Ameera

On Friday, March 22, 2019 at 12:11:11 AM UTC-7, [email protected] wrote:
> I am trying to fine-tune Tesseract for dot-matrix fonts such as that in
> the picture below. When the dots are closely spaced together and touch,
> Tesseract can more or less handle the dot-matrix font with some fine-tuning
> and image processing. However, when the dots do not touch, as in the
> picture below, Tesseract struggles.
>
> I read in An Overview of the Tesseract OCR Engine
> <https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/33418.pdf>
> that the first step in Tesseract's processing pipeline is a connected
> component analysis (second paragraph of Section 2). Since the letters in a
> dot-matrix font do not form connected components, I am wondering if
> Tesseract's connected component analysis may be one reason that Tesseract
> struggles on the image below.
>
> Is there a command to see how Tesseract performs connected component
> analysis on this image?
>
> [image: ex_20.jpg]
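To make the connected-component concern from the quoted message concrete, here is a minimal Python sketch. It is not Tesseract's implementation and does not call Tesseract. The toy glyph `DOT_L` and the helper names `count_components` and `dilate` are made up for illustration. The 3x3 dilation is an assumed stand-in for the kind of image processing mentioned in the quoted message. The sketch only shows that a glyph drawn with non-touching dots splits into one component per dot, and that a small dilation can merge those dots back into a single component.

```python
from collections import deque

# Hypothetical toy glyph: a dot-matrix "L" whose dots do not touch.
DOT_L = [
    "#....",
    ".....",
    "#....",
    ".....",
    "#....",
    ".....",
    "#.#.#",
]


def to_grid(rows):
    """Convert '#'/'.' strings into a binary grid (list of lists of 0/1)."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def count_components(grid):
    """Count 8-connected components of set pixels using breadth-first search."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit all eight neighbours of the current pixel.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


def dilate(grid):
    """3x3 dilation: a pixel is set if any pixel in its 3x3 neighbourhood is set."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            int(any(
                grid[ny][nx]
                for ny in range(max(0, r - 1), min(rows, r + 2))
                for nx in range(max(0, c - 1), min(cols, c + 2))
            ))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


grid = to_grid(DOT_L)
print("components before dilation:", count_components(grid))
print("components after dilation: ", count_components(dilate(grid)))
```

On this toy glyph the six separate dots are counted as six components, and one dilation pass merges them into one. Whether the same preprocessing helps on a real scan depends on the dot spacing and on how close neighbouring characters are, since too much dilation would also merge adjacent letters.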

