Hi Ray,

I get the same problem: ETEXT_DESC->text returns negative values for the position. But I'm calling TessDllAPI::Recognize_all_Words(). Does that make any sense? If so, how do I get those coordinates back to pixel coordinates given only the result of TessDllAPI::Recognize_all_Words()?

Thanks!
On Nov 19 2007, 5:55 pm, "Ray Smith" <[email protected]> wrote:
> Look at the function ConvertWordToBoxText in ccmain/baseapi.cpp.
> It sounds like you are not calling baseline_denormalise to convert the
> coordinates from normalized back to original pixel coordinates.
> The alternatives from the classifier from the current segmentation are
> stored, but alternative segmentations are not.
> Ray.
>
> On Nov 2, 2007 1:07 PM, JussiP <[email protected]> wrote:
> > Hi
> >
> > I want to extract the locations of letters recognized by Tesseract. I
> > also want a list of all considered letter choices rather than just the
> > best one. A thread here showed that you can access this information
> > from the function classify_blob in wordrec/wordclass.cpp.
> >
> > I tried calculating the bounding box of the TBLOB using the function
> > blob_bounding_box and then printing that. The coordinates I get make
> > no sense. I get letters that are hundreds of "elements" wide, and
> > consecutive letters go all over the page. I even get negative
> > coordinates for some letters.
> >
> > Does Tesseract use some funky coordinate system? If yes, how can it be
> > returned to pixel coordinates?
> >
> > Is the bounding box function the correct way to do this? There seems
> > to be another bounding box function as well, but that one is in the
> > API files.
> >
> > Does the final PAGE_RES structure hold the various letter choices
> > somewhere or is only the best match preserved?
> >
> > Thanks for your comments.

- Albert
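
For anyone following the thread, here is a rough sketch of the kind of conversion Ray describes. It is only an illustration, not Tesseract's own code. It assumes that normalization is a uniform scale plus a shift relative to a flat text baseline, and that the y axis points up from the bottom of the image in both coordinate spaces. The names Box, NormParams, PixelRect, Denormalize and ToTopLeft are hypothetical and do not exist in Tesseract; the real logic lives in baseline_denormalise and ConvertWordToBoxText.

```cpp
#include <cassert>

// Hypothetical types for illustration only; NOT Tesseract's own classes.
struct Box {
  double left, bottom, right, top;  // y grows upward (origin at bottom-left)
};

// Assumed normalization parameters (a simplification of the real thing).
struct NormParams {
  double scale;          // normalized units per pixel, assumed uniform
  double x_origin;       // pixel x that maps to normalized x = 0
  double baseline_y;     // pixel y of the text baseline, assumed flat
  double baseline_norm;  // normalized y value assigned to the baseline
};

// Undo the assumed scale-and-shift on a single x coordinate.
double DenormX(double x, const NormParams& p) {
  return x / p.scale + p.x_origin;
}

// Undo the assumed scale-and-shift on a single y coordinate.
double DenormY(double y, const NormParams& p) {
  return (y - p.baseline_norm) / p.scale + p.baseline_y;
}

// Map a normalized box back to pixel coordinates (still bottom-left origin).
Box Denormalize(const Box& b, const NormParams& p) {
  assert(p.scale > 0.0);  // a zero or negative scale would be meaningless
  return {DenormX(b.left, p), DenormY(b.bottom, p),
          DenormX(b.right, p), DenormY(b.top, p)};
}

// Rectangle in the usual image convention: origin top-left, y grows downward.
struct PixelRect {
  double x, y, width, height;
};

// Flip a bottom-left-origin box into a top-left-origin rectangle.
PixelRect ToTopLeft(const Box& b, double image_height) {
  return {b.left, image_height - b.top, b.right - b.left, b.top - b.bottom};
}
```

Under these assumptions, negative normalized values are expected: anything left of the normalization origin or below the baseline comes out negative until it is mapped back with something like Denormalize, and ToTopLeft is only needed if the caller wants top-left image coordinates.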

