Have you tried running the training with these .tr files? Does it work?
If so, you can probably just ignore the errors. I seem to recall those
errors came up in some of my trainings, but they didn't seem to affect
things much. As long as you have multiple samples of each character
you need, you should be OK.
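
To check the "multiple samples" part, something like the rough Python
sketch below could help. It is not part of Tesseract; the function names
are made up for illustration. It assumes the usual box file layout of
"<text> <left> <bottom> <right> <top> <page>", and it assumes the number
after "boxfile line" in the APPLY_BOXES output is the 1-based line number
in the box file. It counts how many boxes per character are left once the
failed ones are discarded.

```python
import re
from collections import Counter

# Matches failure lines such as:
#   APPLY_BOXES: boxfile line 152/X ((1310,127),(1327,181)): FAILURE! ...
FAILED_LINE = re.compile(r"APPLY_BOXES: boxfile line (\d+)/")


def failed_line_numbers(log_text):
    """Return the set of box file line numbers reported as failures."""
    return {int(n) for n in FAILED_LINE.findall(log_text)}


def sample_counts(box_text, failed):
    """Count the boxes per character that were not reported as failed.

    Assumption: line numbers in the log are 1-based positions in the
    box file. Characters whose boxes all failed are kept with count 0.
    """
    counts = Counter()
    for lineno, line in enumerate(box_text.splitlines(), start=1):
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        counts[fields[0]] += 0  # register the character even if it failed
        if lineno not in failed:
            counts[fields[0]] += 1
    return counts


def undersampled(counts, minimum=2):
    """Return the characters with fewer than `minimum` usable boxes."""
    return sorted(ch for ch, n in counts.items() if n < minimum)
```

Save the console output to a text file, read it and the box file in
(open the box file with encoding="utf-8"), and pass both strings to the
functions above. Any character that undersampled() returns probably
needs more samples in the training image.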

Nick

On Wed, Oct 31, 2012 at 09:42:24AM -0700, Reza M wrote:
> Hi Nick,
> I changed the word spacing and line spacing, but I still get these warnings:
> 
> C:\Program Files (x86)\Tesseract-OCR>tesseract per.arial.exp0.tif per.arial.exp0 nobatch box.train
> Tesseract Open Source OCR Engine v3.02 with Leptonica
> row xheight=30, but median xheight = 21
> row xheight=10, but median xheight = 21
> row xheight=29, but median xheight = 21
> FAIL!
> APPLY_BOXES: boxfile line 152/┌ء ((1310,127),(1327,181)): FAILURE! Couldn't find a matching blob
> FAIL!
> APPLY_BOXES: boxfile line 153/╪د ((1280,127),(1293,181)): FAILURE! Couldn't find a matching blob
> FAIL!
> APPLY_BOXES: boxfile line 154/┘╛┘† ((1219,127),(1263,181)): FAILURE! Couldn't find a matching blob
> FAIL!
> APPLY_BOXES: boxfile line 155/╪ذ╪▒ ((1158,127),(1189,181)): FAILURE! Couldn't find a matching blob
> FAIL!
> APPLY_BOXES: boxfile line 156/گز ((1102,127),(1141,181)): FAILURE! Couldn't find a matching blob
> FAIL!
> APPLY_BOXES: boxfile line 157/╪د ((1073,127),(1086,181)): FAILURE! Couldn't find a matching blob
> FAIL!
> APPLY_BOXES: boxfile line 158/╪▒ ((1038,127),(1055,181)): FAILURE! Couldn't find a matching blob
> FAIL!
> APPLY_BOXES: boxfile line 159/شد ((945,127),(1009,181)): FAILURE! Couldn't find a matching blob
> FAIL!
> APPLY_BOXES: boxfile line 160/، ((911,127),(928,181)): FAILURE! Couldn't find a matching blob
> FAIL!
> APPLY_BOXES: boxfile line 191/╪ذ┘ç ((188,67),(225,121)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES: boxfile line 201/╪┤█î┌ر ((649,7),(744,61)): FAILURE! Couldn't find a matching blob
> APPLY_BOXES:
>    Boxes read from boxfile:     202
>    Boxes failed resegmentation:      41
>    Found 161 good blobs.
>    Leaving 12 unlabelled blobs in 0 words.
> TRAINING ... Font name = arial
> Generated training data for 26 words
> 
> I attached the box file, the image and a screenshot.
> I tested two different types of characters (connected and not connected), and
> I changed the font size from 10 to 40.
> I also changed the line spacing from 1 to 4 and the word spacing from 0 to 20.
> All of them gave this warning.
> yours,
> Reza
>
> On Wednesday, October 31, 2012 12:43:31 PM UTC+1, Nick White wrote:
> 
>     Hi Reza,
> 
>     In the future could you copy the text of error messages, rather than
>     attach screenshots? It's easier to read, and means it will be
>     indexed for others who may search for the same issues.
> 
>     > 1-what is xheight?
> 
>     xheight is the height of a lowercase x in English. It's generally
>     the height that most lowercase characters are expected to have, so I
>     think it's used extensively in Tesseract for line finding and similar.
> 
>     > when I write tesseract per.arial.exp0.tif per.arial.exp0 nobatch box.train.stderr
>     > it shows this warning (attached image)
> 
>     I suspect those APPLY_BOXES failures are due to your training images
>     not having enough space between the characters. Try to space them
>     out a bit and see if that helps.
> 
>     Nick
> 
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en


