When I use Tesseract with the ONE_WORD option, it recognizes the comma during box creation, but not the dot or ":". I then insert boxes for those characters. The result is what you can see in the attached picture ...
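For anyone following along, inserting the missing boxes can be scripted rather than typed by hand. This is only a minimal sketch, assuming the Tesseract 3.x box line layout `<char> <left> <bottom> <right> <top> <page>` with the origin at the bottom-left of the image, and a single-line training image so that boxes can be ordered by their left edge. The names `Box`, `parse_box_line`, `format_box`, `insert_box` and `tiny_boxes` are hypothetical helpers, and the coordinates are made up for illustration; they are not taken from normal.box.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """One line of a box file (hypothetical helper type)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int = 0


def parse_box_line(line: str) -> Box:
    """Parse '<char> <left> <bottom> <right> <top> [<page>]' into a Box."""
    parts = line.split()
    nums = [int(p) for p in parts[1:]]
    page = nums[4] if len(nums) > 4 else 0  # older box files may omit the page
    return Box(parts[0], nums[0], nums[1], nums[2], nums[3], page)


def format_box(box: Box) -> str:
    """Render a Box back into a box-file line."""
    return f"{box.char} {box.left} {box.bottom} {box.right} {box.top} {box.page}"


def insert_box(boxes: List[Box], new_box: Box) -> List[Box]:
    """Return a new list with new_box placed in left-to-right order.

    Assumes a single text line, so the left edge alone defines the order.
    """
    result = list(boxes)
    index = 0
    while index < len(result) and result[index].left <= new_box.left:
        index += 1
    result.insert(index, new_box)
    return result


def tiny_boxes(boxes: List[Box], min_side: int = 3) -> List[Box]:
    """List boxes narrower or shorter than min_side pixels (assumed threshold)."""
    return [
        b for b in boxes
        if (b.right - b.left) < min_side or (b.top - b.bottom) < min_side
    ]


if __name__ == "__main__":
    # Made-up box lines: a digit and a comma that were detected automatically.
    lines = ["0 2 3 7 11 0", ", 20 1 22 4 0"]
    boxes = [parse_box_line(line) for line in lines]
    # Add the dot that was not detected (coordinates are illustrative only).
    boxes = insert_box(boxes, Box(".", 12, 3, 13, 4, 0))
    for box in boxes:
        print(format_box(box))
```

`tiny_boxes` is just a quick way to list the boxes most likely to trigger the "no blobs" failure, so they can be checked against the image before running box training again. It does not fix the underlying recognition problem.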
On 26 May, 15:39, Joyse1 <[email protected]> wrote:
> You will find the png, the box file, and the apply_boxes messages in the attachment.
>
> Thanks in advance!
>
> > I think I know what the issue could be here. Refer to
> > http://code.google.com/p/tesseract-ocr/issues/detail?id=446&can=5.
> > Although you are using another layout mode, this issue can still hold
> > true.
> >
> > In brief, for small images Tess confuses background and foreground
> > pixels. That's why it treats characters' inner holes as characters and
> > recognizes them as such. To avoid this you can try to add more
> > characters to the training image or make corrections to the Tesseract
> > code - I've indicated what should be done inside the issue.
> >
> > However, I might be wrong. To give more relevant advice I need to see
> > your images, cmd line etc.
> >
> > Warm regards,
> > Dmitri Silaev
> > www.CustomOCR.com
> >
> > On Thu, May 26, 2011 at 5:30 AM, Joyse1 <[email protected]> wrote:
> >> Hi,
> >> I have a small font (Microsoft Sans Serif, 8; string to learn:
> >> " 0 1 2 3 4 5 6 7 8 9 . , : "). I can't train recognition of
> >> single-pixel characters (e.g. ".", ",", ":"). I get failures when
> >> generating the tr files.
> >> I have two versions of tess: one with the layout analyzer turned on,
> >> and one with the one_word_only option turned on. The only difference
> >> between them is that with the one-word option (PSM_ONE_WORD in
> >> tesseract) it generates a box for the comma and recognizes it. So I
> >> get failures ("no blobs ...") only for "." and ":" (with the layout
> >> analyzer turned on I get failures for all three of them: ". , :").
> >> I don't think that changing the one_word option to single_char would
> >> help here. Could somebody please tell me what the solution is here
> >> (without resizing the training images)?
> >>
> >> Best
> >> Jakub
>
> Attachments: apply_boxes_info.PNG (21K), normal.box (< 1K), normal.PNG (1K)

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

