Some links:
http://stackoverflow.com/questions/10238765/contours-opencv-how-to-eliminate-small-contours-in-a-binary-image
http://stackoverflow.com/questions/15628739/c-opencv-eliminate-smaller-contours
http://www.pyimagesearch.com/2015/02/09/removing-contours-image-using-python-opencv/
I wasn't entirely impressed by the bounding box method of contour removal, but I did find success with findContours: just filter out the contours you want to lose (in your case by height, I would say) and replace their black pixels with white.

http://docs.opencv.org/2.4/doc/tutorials/imgproc/shapedescriptors/find_contours/find_contours.html

Looking at that text, I would also consider doing some morphology to make the characters a bit stronger.

I hope this helps.

On Saturday, March 5, 2016 at 3:25:06 AM UTC+8, Stephen Lambie wrote:
> What function of opencv would you use to do that?
>
> On Thursday, 3 March 2016 22:17:03 UTC-8, Meh Hem wrote:
>> There are definitely Tesseract API configs for that:
>> textord_heavy_nr = 0 (0 default, 1 is *very* aggressive)
>> textord_max_noise_size
>>
>> However, I would simply use opencv to remove any blob with a vertical
>> height of less than desired.
>>
>> On Monday, December 14, 2015 at 9:42:29 PM UTC+8, Filippo Riccio wrote:
>>> Hello everybody,
>>>
>>> I am testing Tesseract to recognize the characters in the attached
>>> picture.
>>>
>>> I created a traineddata with a small number of characters.
>>>
>>> My problem is that Tesseract also recognizes as characters the small
>>> lines to the left of the first 0 and under the J. Precisely, the
>>> recognized text is F0002HNJH2UF
>>>
>>> How can I avoid it? Is it possible to set a minimum character size?
>>>
>>> Thank you in advance.

