I think I know what the issue could be here. See
http://code.google.com/p/tesseract-ocr/issues/detail?id=446&can=5.
Even though you are using a different layout mode, that issue may still
apply.

In brief, for small images Tess confuses background and foreground
pixels. As a result it treats the inner holes of characters as
characters and tries to recognize them as such. To avoid this, you can
either add more characters to the training image or patch the Tesseract
code; I've described what should be changed in the issue.
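
For the first option, here is a minimal sketch of how a denser training
text could be generated, so that ".", "," and ":" sit among many other
glyphs instead of in a nearly empty image. The function name
build_training_lines and the line/row counts are hypothetical, chosen
only for illustration. Rendering the text into an image in your 8-size
font is left to whatever tool you already use, and I can't promise this
alone removes the "no blobs" failures.

```python
import random


def build_training_lines(glyphs, lines=10, per_line=30, seed=0):
    """Return text lines in which every glyph appears several times.

    Hypothetical helper: it only produces the text to render into a
    training image; it does not call Tesseract.
    """
    rng = random.Random(seed)  # fixed seed so the text is reproducible
    out = []
    for _ in range(lines):
        # Cycle through the glyph set so each glyph is guaranteed to
        # appear, then shuffle so punctuation lands between digits.
        row = [glyphs[i % len(glyphs)] for i in range(per_line)]
        rng.shuffle(row)
        out.append(" ".join(row))
    return out


# Glyph set taken from the original question.
GLYPHS = list("0123456789.,:")

training_lines = build_training_lines(GLYPHS)
print("\n".join(training_lines))
```

The glyphs stay space-separated, as in your original training string,
so the box file still gets one box per character.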

However, I might be wrong. To give more relevant advice I need to see
your images, command line, etc.

Warm regards,
Dmitri Silaev
www.CustomOCR.com





On Thu, May 26, 2011 at 5:30 AM, Joyse1 <[email protected]> wrote:
> Hi,
>   I have a small font (Microsoft Sans Serif, size 8; string to learn:
> "0 1 2 3 4 5 6 7 8 9 . , :"). I can't train recognition of single-pixel
> characters (e.g. ".", ",", ":"). I get failures when generating the tr
> files.
> I have two versions of tess: one with the layout analyzer turned on, and
> one with the one_word_only option turned on. The only difference between
> them is that with the one-word option (PSM_ONE_WORD in tesseract) it
> generates a box for the comma and recognizes it. So there I get failures
> ("no blobs ...") only for "." and ":" (with the layout analyzer turned on
> I get failures for all three: ". , :"). I don't think that changing the
> one_word option to single_char would help here. Could somebody please
> tell me what the solution is here (without resizing the training images)?
>
> Best
> Jakub
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
