Some links:
http://stackoverflow.com/questions/10238765/contours-opencv-how-to-eliminate-small-contours-in-a-binary-image
http://stackoverflow.com/questions/15628739/c-opencv-eliminate-smaller-contours
http://www.pyimagesearch.com/2015/02/09/removing-contours-image-using-python-opencv/

I was not entirely impressed by the bounding box method of contour removal, but 
I did find success with findContours: 

Just pick out the contours you want to lose (in your case by height, I would 
say) and replace their black pixels with white.

http://docs.opencv.org/2.4/doc/tutorials/imgproc/shapedescriptors/find_contours/find_contours.html

Looking at that text I would also consider doing some morphology to make 
the characters a bit stronger. 

I hope this helps


On Saturday, March 5, 2016 at 3:25:06 AM UTC+8, Stephen Lambie wrote:
>
> Which OpenCV function would you use to do that?
>
> On Thursday, 3 March 2016 22:17:03 UTC-8, Meh Hem wrote:
>>
>> There are definitely Tesseract API configs for that:
>> textord_heavy_nr = 0 (0 default, 1 is *very* aggressive)
>> textord_max_noise_size
>>
>> However, I would simply use OpenCV to remove any blob whose height is 
>> below the desired minimum. 
>>
>>
>> On Monday, December 14, 2015 at 9:42:29 PM UTC+8, Filippo Riccio wrote:
>>>
>>> Hello everybody,
>>>
>>> I am testing Tesseract to recognize the characters in the attached 
>>> picture.
>>>
>>> I created a traineddata with a small number of characters.
>>>
>>> My problem is that Tesseract also recognizes as characters the small 
>>> lines to the left of the first 0 and
>>> under the J. Specifically, the recognized text is F0002HNJH2UF
>>>
>>> How can I avoid this? Is it possible to set a minimum character size?
>>>
>>> Thank you in advance.
>>>
>>>
>>>
