Does anyone know about this?

On Tue, Aug 23, 2011 at 8:13 PM, bo gao <[email protected]> wrote:
> In the training process, although I have all the files I need, I get some
> failure reports: "Couldn't find a matching blob".
> Is that normal?
>
> Thanks!
>
> ......
> APPLY_BOXES: boxfile line 28970/g ((20496,698),(20515,729)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES:
> Boxes read from boxfile: 29046
> Boxes failed resegmentation: 463
> ......
> APPLY_BOXES: Unlabelled word at :Bounding box=(5908,960)->(5944,971)
> APPLY_BOXES: Unlabelled word at :Bounding box=(690,962)->(761,994)
> APPLY_BOXES: Unlabelled word at :Bounding box=(2307,959)->(2345,972)
> Found 28583 good blobs and 1026 unlabelled blobs in 0 words.
> 74 remaining unlabelled words deleted.
> TRAINING ... Font name = arial
> Generated training data for 5943 words
>
>
> On Tue, Aug 23, 2011 at 7:32 PM, bo gao <[email protected]> wrote:
>
>> Hi, All,
>>
>> For dictionary:
>>
>> I added a dictionary for Tesseract 3, but I did not see the output change.
>>
>> Then I tried to turn up the parameters as described on the Wiki page:
>>
>> Try upping NON_WERD and GARBAGE_STRING in dict/permute.cpp to maybe 3 or
>> even 5.
>>
>> There is no NON_WERD or GARBAGE_STRING in dict/permute.cpp. Should I
>> refer to segment_penalty_garbage and segment_penalty_dict_nonword in
>> dict/dict.h instead?
>>
>> How can I put more weight on the dictionary?
>>
>> For training:
>>
>> I used the 32 tiff files, but after training the performance degraded, and
>> the traineddata is much smaller. How should I improve the performance?
>> Has anyone trained a better classifier than the provided eng.traineddata?
>>
>> Thanks!
>> --
>>
>> Best,
>>
>> Bo
>>
>
> --
>
> Best,
>
> Bo
>

--
Best,
Bo
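
To put the numbers in the quoted log in perspective, here is a minimal sketch that pulls the two APPLY_BOXES summary lines out of the training output and computes the share of boxes that failed resegmentation. The function name box_failure_rate is hypothetical and not part of Tesseract; the sketch only assumes the log contains the two summary lines exactly as they appear above.

```python
import re


def box_failure_rate(log_text):
    """Return (boxes_read, boxes_failed, failed_fraction) from a training log.

    Hypothetical helper: it only looks for the two summary lines shown in
    the quoted APPLY_BOXES output and does not call Tesseract itself.
    """
    read = re.search(r"Boxes read from boxfile:\s*(\d+)", log_text)
    failed = re.search(r"Boxes failed resegmentation:\s*(\d+)", log_text)
    if read is None or failed is None:
        raise ValueError("APPLY_BOXES summary lines not found in log")
    n_read = int(read.group(1))
    n_failed = int(failed.group(1))
    if n_read == 0:
        raise ValueError("log reports zero boxes read")
    return n_read, n_failed, n_failed / n_read


# Summary lines copied from the log quoted in this thread.
log = """APPLY_BOXES:
Boxes read from boxfile: 29046
Boxes failed resegmentation: 463
"""

n_read, n_failed, frac = box_failure_rate(log)
print(f"{n_failed} of {n_read} boxes failed resegmentation ({frac:.2%})")
```

For the log above this works out to roughly 1.59% of the boxes, which gives a concrete figure to compare between training runs.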

