Yes, Zdenko, I know. But people have to call it something. I use commit 
648e7ca31109 of the current master branch.

Paul

On Wednesday, 24 September 2014 13:10:15 UTC+2, zdenop wrote:
>
> You are not using version 3.04 ;-)
> There is a development version of Tesseract marked that way, but it is not 
> finished yet (AFAIK Ray is still committing some changes).
>
> Zdenko
>
> On Wed, Sep 24, 2014 at 10:56 AM, Paul <[email protected]> wrote:
>
>> I use 3.04. You may try to upgrade.
>>
>> On Tuesday, 23 September 2014 02:12:17 UTC+2, 葉家忠 wrote:
>>>
>>> 3.02... I've found the code snippet you mentioned, but I can't get it to execute.
>>> On 23 Sep 2014 at 5:29 AM, "Paul" <[email protected]> wrote:
>>>
>>>> Those sections are definitely run. Which version of Tesseract are you 
>>>> using?
>>>>
>>>> On Monday, 1 September 2014 09:00:29 UTC+2, 葉家忠 wrote:
>>>>>
>>>>> Thank you very much for your kind help~
>>>>>
>>>>> I tried what you described above, but nothing changed.
>>>>> When I traced the code in debug mode, I found that the code mentioned 
>>>>> above is never run.
>>>>> I wonder if there is a parameter I should set to true?
>>>>>
>>>>> Please teach me more.
>>>>> Thanks again~
>>>>>
>>>>>
>>>>> On Saturday, 30 August 2014 at 6:16:09 PM UTC+8, Paul wrote:
>>>>>>
>>>>>> I suffered from similar issues and fixed the problem by adding a line 
>>>>>> to textord/colfind.cpp:
>>>>>>
>>>>>> Between
>>>>>>
>>>>>> #endif // GRAPHICS_DISABLED
>>>>>>
>>>>>> and
>>>>>>
>>>>>> SetBlockRuleEdges(input_block);
>>>>>>
>>>>>> I added:
>>>>>>
>>>>>> input_block->noise_blobs.clear(); // remove noise blobs
>>>>>>
>>>>>> This will remove noise blobs during the segmentation of blocks and 
>>>>>> prevent noise blobs from being added to the text block around them. I 
>>>>>> think it is a dirty hack, but it will probably give you better results. 
>>>>>> Maybe we will have to tackle this problem with a more in-depth solution 
>>>>>> in the future.
>>>>>>
>>>>>> Changing the constant
>>>>>>
>>>>>> const double kMinMediumSizeRatio = 0.25;
>>>>>>
>>>>>> to
>>>>>>
>>>>>> const double kMinMediumSizeRatio = 0.15;
>>>>>>
>>>>>> in blobbox.cpp also helped to improve the results. You can try to 
>>>>>> adjust that constant to your needs.
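
A minimal standalone sketch of the idea behind that constant, assuming it acts as a lower bound on blob height relative to the estimated line size. Only `kMinMediumSizeRatio` comes from the message above; `Blob`, `SizeSplit`, and `SplitBySize` are hypothetical names used for illustration and are not Tesseract's actual code.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for a connected component; not Tesseract's BLOBNBOX.
struct Blob {
  int width;
  int height;
};

// Result of the size split: blobs treated as noise and blobs kept as text.
struct SizeSplit {
  std::vector<Blob> noise;
  std::vector<Blob> medium;
};

// Assumption: a blob counts as noise when its height is below
// line_size * min_ratio. Lowering min_ratio keeps smaller blobs as text.
SizeSplit SplitBySize(const std::vector<Blob>& blobs, double line_size,
                      double min_ratio) {
  const long min_height = std::lround(line_size * min_ratio);
  SizeSplit split;
  for (const Blob& blob : blobs) {
    if (blob.height < min_height) {
      split.noise.push_back(blob);
    } else {
      split.medium.push_back(blob);
    }
  }
  return split;
}
```

Under this assumption, with a line size of 40 and blob heights of 40, 41, 8, and 3, a ratio of 0.25 treats the two small blobs as noise, while a ratio of 0.15 treats only the 3-pixel blob as noise.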
>>>>>>
>>>>>> Paul
>>>>>>
>>>>>>
>>>>>> On Thursday, 28 August 2014 10:04:33 UTC+2, 葉家忠 wrote:
>>>>>>>
>>>>>>> I use Tesseract to recognize Simplified Chinese characters.
>>>>>>>
>>>>>>> Since some noise in the source image can't be removed, I decided 
>>>>>>> to modify the source code to remove the incorrect results.
>>>>>>>
>>>>>>> Since every Chinese character has the same fixed size, the 
>>>>>>> noise can be found easily, because it is much smaller than a 
>>>>>>> normal character.
>>>>>>>
>>>>>>> I've tried setting the parameter "textord_heavy_nr" to true to 
>>>>>>> remove the noise, but that doesn't work, because in some cases it 
>>>>>>> removes important parts of a Chinese character that are necessary 
>>>>>>> to form the complete character.
>>>>>>>
>>>>>>> Can anyone tell me how to change the code so that it removes any 
>>>>>>> final result from Tesseract that is smaller than a specific blob 
>>>>>>> size?
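
One way to approach this without touching Tesseract's internals is to post-filter the final symbols by bounding-box size. The following is a minimal sketch, assuming each recognized symbol's text and box dimensions are already available; `Symbol`, `MedianHeight`, `FilterSmallSymbols`, and the ratio passed in are hypothetical and not part of Tesseract.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for one recognized symbol and its bounding box size.
struct Symbol {
  std::string text;
  int width;
  int height;
};

// Returns the (upper) median height of the symbols, or 0 if there are none.
int MedianHeight(const std::vector<Symbol>& symbols) {
  if (symbols.empty()) return 0;
  std::vector<int> heights;
  heights.reserve(symbols.size());
  for (const Symbol& s : symbols) heights.push_back(s.height);
  auto mid = heights.begin() + heights.size() / 2;
  std::nth_element(heights.begin(), mid, heights.end());
  return *mid;
}

// Drops a symbol only when BOTH its width and height are below
// min_ratio * median height, so wide-but-flat or tall-but-narrow
// characters are kept.
std::vector<Symbol> FilterSmallSymbols(const std::vector<Symbol>& symbols,
                                       double min_ratio) {
  const double min_size = MedianHeight(symbols) * min_ratio;
  std::vector<Symbol> kept;
  for (const Symbol& s : symbols) {
    if (std::max(s.width, s.height) >= min_size) kept.push_back(s);
  }
  return kept;
}
```

Because the filter works on whole recognized symbols rather than on individual blobs, it cannot strip strokes out of a character the way aggressive blob-level noise removal can. The ratio would need tuning for the images at hand.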
>>>>>>>
>>>>>>> Thank you very much for your help~
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> PS: the attached file shows 3 characters, but it is recognized as 
>>>>>>> 4 characters because of the noise.
>>>>>>>
>>>>>>  -- 
>>>> You received this message because you are subscribed to a topic in the 
>>>> Google Groups "tesseract-ocr" group.
>>>> To unsubscribe from this topic, visit 
>>>> https://groups.google.com/d/topic/tesseract-ocr/M80Et5GOZXA/unsubscribe.
>>>> To unsubscribe from this group and all its topics, send an email to 
>>>> [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/tesseract-ocr/20489d50-3464-427d-b599-896f519d5599%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>
>
