I would like to know the same thing, but I haven't found a solution to it either.
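
Not a fix, but one way to judge how much these failures matter is to work out what share of the boxes failed resegmentation. Here is a minimal sketch in Python, assuming the summary lines look exactly like the ones in the log quoted below. The function name `box_failure_rate` is made up for illustration and is not part of Tesseract.

```python
import re

# Excerpt of the APPLY_BOXES summary copied from the quoted training log.
LOG = """\
APPLY_BOXES:
   Boxes read from boxfile:   29046
   Boxes failed resegmentation:     463
   Found 28583 good blobs and 1026 unlabelled blobs in 0 words.
"""


def box_failure_rate(log_text):
    """Return (boxes_read, boxes_failed, failed_fraction) parsed from the log.

    Assumes both summary lines are present in the wording shown above.
    """
    read = int(re.search(r"Boxes read from boxfile:\s*(\d+)", log_text).group(1))
    failed = int(re.search(r"Boxes failed resegmentation:\s*(\d+)", log_text).group(1))
    return read, failed, failed / read


read, failed, rate = box_failure_rate(LOG)
print(f"{failed} of {read} boxes failed resegmentation ({rate:.2%})")
```

For the quoted log this comes to roughly 1.6% of the boxes, and the 28583 good blobs match the boxes read minus the failed ones. I can't say for certain whether that rate is normal.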

On Thursday, August 25, 2011 at 2:40:41 AM UTC+8, bo gao wrote:

> Anyone knows about it?
>
> On Tue, Aug 23, 2011 at 8:13 PM, bo gao <[email protected]> wrote:
>
>> During training, although I have all the files I need, I get 
>> some failure reports: Couldn't find a matching blob.
>> Is that normal?
>>
>> Thanks!
>> ......
>> APPLY_BOXES: boxfile line 28970/g ((20496,698),(20515,729)): FAILURE! Couldn't find a matching blob
>> APPLY_BOXES:
>>    Boxes read from boxfile:   29046
>>    Boxes failed resegmentation:     463
>> ......
>> APPLY_BOXES: Unlabelled word at :Bounding box=(5908,960)->(5944,971)
>> APPLY_BOXES: Unlabelled word at :Bounding box=(690,962)->(761,994)
>> APPLY_BOXES: Unlabelled word at :Bounding box=(2307,959)->(2345,972)
>>    Found 28583 good blobs and 1026 unlabelled blobs in 0 words.
>>    74 remaining unlabelled words deleted.
>> TRAINING ... Font name = arial
>> Generated training data for 5943 words
>>
>>
>> On Tue, Aug 23, 2011 at 7:32 PM, bo gao <[email protected]> wrote:
>>
>>> Hi, All,
>>>
>>> For dictionary:
>>>
>>> I added a dictionary for Tesseract 3, but I did not see the output change.
>>>
>>> Then I tried to turn up the parameters as described on the Wiki page:
>>>
>>> Try upping NON_WERD and GARBAGE_STRING in dict/permute.cpp to maybe 3 or 
>>> even 5.
>>>
>>> There is no NON_WERD or GARBAGE_STRING in dict/permute.cpp. Should I 
>>> refer to segment_penalty_garbage and segment_penalty_dict_nonword in 
>>> dict/dict.h instead?
>>>
>>> How can I put more weight on the dictionary?
>>>
>>> For training:
>>>
>>> I used the 32 TIFF files, but after training the performance degraded, 
>>> and the traineddata file is much smaller. How can I improve the performance? 
>>> Has anyone trained a better classifier than the provided eng.traineddata?
>>>
>>> Thanks!
>>> -- 
>>>
>>> Best,
>>>
>>> Bo
>>>  
>>
>>
>>
>> -- 
>>
>> Best,
>>
>> Bo
>>  
>
>
>
> -- 
>
> Best,
>
> Bo
>

