Hi Ray,

Okay, thanks.

-
Albert

From: [email protected] [mailto:[email protected]] On Behalf Of Ray Smith
Sent: Monday, February 09, 2009 18:17
To: [email protected]
Subject: Re: How do coordinates work?

 

I don't know. Do all the boxes have negative coordinates, or only some? Do
any of the coordinates match the image?

This API is not thread-safe and will likely have to change to TessBaseAPI
at some point in the future.

You might be better off in the longer term extending the TessBaseAPI class and
using TesseractExtractResult, but bear in mind that the thread-safety changes
in 3.00 are going to change it from static methods to a class-instance-based
API.

Ray.

On Fri, Feb 6, 2009 at 9:30 AM, Albert Law <[email protected]> wrote:


Hi Ray,

I get the same problem with ETEXT_DESC->text returning negative values for the
position. But I'm calling TessDllAPI::Recognize_all_Words(). Does that make
any sense? If so, how do I get those coordinates back to pixel world given only
the result of TessDllAPI::Recognize_all_Words()? Thanks!

On Nov 19 2007, 5:55 pm, "Ray Smith" <[email protected]> wrote:
> Look at the function ConvertWordToBoxText in ccmain/baseapi.cpp.
> It sounds like you are not calling baseline_denormalise to convert the
> coordinates from normalized back to original pixel coordinates.
> The alternatives from the classifier from the current segmentation are
> stored, but alternative segmentations are not.
> Ray.
>
> On Nov 2, 2007 1:07 PM, JussiP <[email protected]> wrote:
>
>
>
> > Hi
>
> > I want to extract the locations of letters recognized by Tesseract. I
> > also want a list of all considered letter choices rather than just the
> > best one. A thread here showed that you can access this information
> > from the function classify_blob in wordrec/wordclass.cpp.
>
> > I tried calculating the bounding box of the TBLOB using the function
> > blob_bounding_box and then printing that. The coordinates I get make
> > no sense. I get letters that are hundreds of "elements" wide, and
> > consecutive letters go all over the page. I even get negative
> > coordinates for some letters.
>
> > Does Tesseract use some funky coordinate system? If yes, how can it be
> > returned to pixel coordinates?
>
> > Is the bounding box function the correct way to do this? There seems
> > to be another bounding box function as well, but that one is in the
> > API files.
>
> > Does the final PAGE_RES structure hold the various letter choices
> > somewhere or is only the best match preserved?
>
> > Thanks for your comments.
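
For readers following the thread: the denormalisation Ray mentions amounts to
inverting whatever mapping was applied to the blob coordinates before
classification. The sketch below is not Tesseract's code. It assumes the
normalisation is a simple shift-and-scale (x measured from some origin such as
the word centre, y measured from the baseline and moved to a fixed offset),
which would explain both the negative values and the oversized boxes reported
above. Box, NormParams, Normalise and Denormalise are hypothetical names made
up for illustration; the real routine to read is baseline_denormalise, as used
in ConvertWordToBoxText.

```cpp
#include <cassert>

// Hypothetical types for illustration only; these are NOT Tesseract's
// TBLOB or bounding-box structures.
struct Box {
  double left, bottom, right, top;
};

// Assumed normalisation: normalised = (pixel - origin) * scale + offset.
struct NormParams {
  double x_origin;    // pixel x that maps to normalised x = 0
  double baseline_y;  // pixel y of the text baseline
  double scale;       // normalised units per pixel (must be > 0)
  double y_offset;    // normalised y assigned to the baseline
};

// Forward mapping, shown only to illustrate where negative values come from:
// anything left of x_origin ends up with a negative normalised x.
Box Normalise(const Box& px, const NormParams& p) {
  assert(p.scale > 0.0);
  return Box{(px.left - p.x_origin) * p.scale,
             (px.bottom - p.baseline_y) * p.scale + p.y_offset,
             (px.right - p.x_origin) * p.scale,
             (px.top - p.baseline_y) * p.scale + p.y_offset};
}

// Inverse mapping: undo the offset, undo the scale, then add the origin back.
Box Denormalise(const Box& n, const NormParams& p) {
  assert(p.scale > 0.0);
  return Box{n.left / p.scale + p.x_origin,
             (n.bottom - p.y_offset) / p.scale + p.baseline_y,
             n.right / p.scale + p.x_origin,
             (n.top - p.y_offset) / p.scale + p.baseline_y};
}
```

Under these assumptions, a 20-pixel-wide letter sitting left of the origin with
a scale of 4 shows up as an 80-unit-wide box with negative x, and Denormalise
recovers the original pixel box. The actual parameters Tesseract uses for a
given word have to come from the row and word structures, which this sketch
does not model.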


-
Albert





 




--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/tesseract-ocr?hl=en
-~----------~----~----~----~------~----~------~--~---