Hello,

From http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract :

It is *ABSOLUTELY VITAL* to space out the text a bit when printing, so up
the inter-character and inter-line spacing in your word processor. Not
spacing text out sufficiently will cause "FAILURE! box overlaps no blobs or
blobs in multiple rows" errors during tr file generation, which leads to
FATALITY - 0 labelled samples of "x", which leads to "Error: X classes in
inttemp while unicharset contains Y unichars" and you can't use your nice
new data files. This situation will improve in the future, as we are working
on a solution, but for 3.00 APPLY_BOXES errors remain the most problematic
difficulty for people training tesseract.


So you do not need to add space between words, but it is important that you
have enough space between characters.
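
If you want a quick sanity check before running training, here is a rough
Python sketch that flags neighbouring boxes that sit very close together on
the same text line. It is only a heuristic of mine, not the check Tesseract
itself performs. It assumes each box file line looks like
"<symbol> <left> <bottom> <right> <top> [<page>]"; the names parse_box_lines
and tight_pairs and the min_gap threshold are made up for illustration.

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    symbol: str
    left: int
    bottom: int
    right: int
    top: int


def parse_box_lines(lines: List[str]) -> List[Box]:
    """Parse box file lines, assuming '<symbol> <left> <bottom> <right> <top> [<page>]'."""
    boxes = []
    for line in lines:
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        left, bottom, right, top = (int(p) for p in parts[1:5])
        boxes.append(Box(parts[0], left, bottom, right, top))
    return boxes


def tight_pairs(boxes: List[Box], min_gap: int = 2) -> List[Tuple[Box, Box]]:
    """Return consecutive boxes on the same row whose horizontal gap is below min_gap.

    min_gap is a hypothetical threshold in pixels; tune it for your images.
    """
    pairs = []
    for a, b in zip(boxes, boxes[1:]):
        # Two boxes are treated as being on the same row if their vertical ranges overlap.
        same_row = a.bottom < b.top and b.bottom < a.top
        # Horizontal gap between the two boxes; negative when they overlap.
        gap = max(a.left, b.left) - min(a.right, b.right)
        if same_row and gap < min_gap:
            pairs.append((a, b))
    return pairs
```

You would feed it the lines of your .box file, e.g.
tight_pairs(parse_box_lines(open("eng.arial.box").read().splitlines())),
and re-print the page with wider character spacing if it reports many pairs.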

Best regards,

Zdenko

2010/5/1 M. Bashir Al-Noimi <[email protected]>

>  Hello again,
>
> Does anyone know the answer? I trained Tess and didn't find any
> difference.
>
> On 30/04/2010 10:18 PM, M. Bashir Al-Noimi wrote:
>
> Hi All,
>
> I noticed that in the training images for English, French and Dutch nearly
> all the characters are grouped as words, so I'm asking: does grouping
> characters in the training image affect the recognition process?
>
> For example, from eng.arial.g4.tif I captured the following crop:
>
> [image: eng.arial.g4.png]
>
> If I input the following crop, does it give me the same recognition result
> as the one above?
>
> [image: eng.arial.g4_test.png]
>
>
> --
> Best Regards
> Muhammad Bashir Al-Noimi
> My Blog: http://mbnoimi.net
>
>  --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.
>
