Re: [tesseract-ocr] Re: How to fix the code to remove the result lastly decided by Tesseract which size is smaller than specific blob size?

葉家忠 Mon, 22 Sep 2014 17:12:51 -0700

3.02... I found the code snippet you mentioned, but I can't get it to execute.
On Sep 23, 2014, at 5:29 AM, "Paul" <[email protected]> wrote:


> Those sections are definitely run. Which version of Tesseract are you
> using?
>
> On Monday, 1 September 2014 09:00:29 UTC+2, 葉家忠 wrote:
>>
>> Thank you very much for your kind help.
>>
>> I tried what you suggested above, but nothing changed.
>> When I traced the code in debug mode, I found that the code mentioned above
>> is never run, not even once.
>> Is there a parameter I need to set to true?
>>
>> Please tell me more.
>> Thanks again.
>>
>>
>> On Saturday, 30 August 2014 at 6:16:09 PM UTC+8, Paul wrote:
>>>
>>> I suffered from similar issues and fixed the problem by adding a line to
>>> textord/colfind.cpp:
>>>
>>> Between
>>>
>>> #endif // GRAPHICS_DISABLED
>>>
>>> and
>>>
>>> SetBlockRuleEdges(input_block);
>>>
>>> I added:
>>>
>>> input_block->noise_blobs.clear(); // remove noise blobs
>>>
>>> This removes noise blobs during the segmentation of blocks and
>>> prevents them from being added to the text block around them. I think
>>> it is a dirty hack, but it will probably give you better results. Maybe we
>>> will have to tackle this problem with a more thorough solution in the future.
>>>
>>> Changing the constant
>>>
>>> const double kMinMediumSizeRatio = 0.25;
>>>
>>> to
>>>
>>> const double kMinMediumSizeRatio = 0.15;
>>>
>>> in blobbox.cpp also helped to improve the results. You can try to
>>> adjust that constant to your needs.
>>>
>>> Paul
>>>
>>>
>>> On Thursday, 28 August 2014 10:04:33 UTC+2, 葉家忠 wrote:
>>>>
>>>> I use Tesseract to recognize Simplified Chinese characters.
>>>>
>>>> Some of the noise in the source image can't be removed, so I decided to
>>>> modify the source code to remove the incorrect results.
>>>>
>>>> Since every Chinese character has the same fixed size, the noise
>>>> can be found easily: it is much smaller than a normal
>>>> character.
>>>>
>>>> I tried setting the parameter "textord_heavy_nr" to true to remove
>>>> the noise, but that doesn't work, because in some cases it also removes
>>>> important parts of a Chinese character that are needed to form the
>>>> complete character.
>>>>
>>>> Can anyone tell me how to modify the code so that it removes any final
>>>> result from Tesseract that is smaller than a specific blob size?
>>>>
>>>> Thank you very much for your help.
>>>>
>>>>
>>>>
>>>> PS: The attached file shows 3 characters, but they are recognized as 4
>>>> characters because of the noise.
>>>>
>>>  --
> You received this message because you are subscribed to a topic in the
> Google Groups "tesseract-ocr" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/tesseract-ocr/M80Et5GOZXA/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/20489d50-3464-427d-b599-896f519d5599%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/20489d50-3464-427d-b599-896f519d5599%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

