Eugene Reimer wrote, On 2009-06-23 23:11:

> Thanks, Ray.  However, I'm unable to accept your explanation of those 
> "box overlaps no blobs or blobs in multiple rows" messages.  The first 
> of those in my boxfile occurs for the "." line, reproduced here 
> together with all its neighbours:
>
> v 2647 5678 2689 5726
> e 2690 5679 2732 5725
>
> s 2638 5577 2675 5624
> . 2676 5577 2694 5593
> s 2727 5575 2762 5621
>
> a 2664 5474 2703 5523
>
> You'll notice that its box does not overlap the box of any neighbour.
>
> Another reason why I'm not convinced that overlapping boxes are what 
> the program is complaining about:  the distributed training package 
> for German (boxtiff-2.01.deu.tar.gz) contains a boxfile for the 
> arialbi font which does have outright overlapping boxes for the 
> adjacent characters "{j", whose lines are:
> { 2759 3073 2777 3111 0
> j 2776 3073 2795 3111 0
> where the box for "{" ends at x:2777, and the one for "j" begins at 
> x:2776.  And yet that boxfile appears to have been acceptable, and to 
> have produced usable training info.
>
> One thing that's unusual about the boxes being complained about is 
> that each has a y-upper-bound considerably lower than those of the 
> other characters in the same row.  AHA, revising those y-upper-bounds 
> upwards to agree with their same-row neighbours gets rid of those 
> complaints!!  Who would have thought it?
>
>
> Ray Smith wrote, On 2009-06-23 11:37:
>
>> I put the answers to these questions on the training page.
>> Ray.
>
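
The two checks discussed in the quoted message can be sketched in a few lines of Python. This is only an illustration, not how Tesseract itself decides when to print the warning. It assumes each boxfile line has the form "<char> <left> <bottom> <right> <top> [page]" with the origin at the bottom-left of the image. The names Box, parse_box_line, overlaps and short_boxes are hypothetical helpers, and the 0.5 height ratio is an arbitrary threshold chosen for the example.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    char: str
    left: int
    bottom: int
    right: int
    top: int


def parse_box_line(line: str) -> Box:
    """Parse '<char> <left> <bottom> <right> <top> [page]' (assumed format)."""
    parts = line.split()
    left, bottom, right, top = (int(p) for p in parts[1:5])
    return Box(parts[0], left, bottom, right, top)


def overlaps(a: Box, b: Box) -> bool:
    """True when the two rectangles share interior area; touching edges do not count."""
    return (a.left < b.right and b.left < a.right
            and a.bottom < b.top and b.bottom < a.top)


def short_boxes(row: List[Box], ratio: float = 0.5) -> List[Box]:
    """Return boxes much shorter than the tallest box in the same row.

    'ratio' is an arbitrary illustrative threshold, not a Tesseract parameter.
    """
    tallest = max(b.top - b.bottom for b in row)
    return [b for b in row if (b.top - b.bottom) < ratio * tallest]
```

Run on the "s . s" row quoted above, overlaps reports no overlap between the "." box and either neighbour, while short_boxes flags only the "." box, whose top (5593) is well below its neighbours' tops (5624 and 5621). On the "{j" pair from the German boxfile, overlaps returns True.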

