Never mind, my bad. I understand now.

On Wed, May 26, 2010 at 2:30 AM, haratron <[email protected]> wrote:
> Character coordinates are also OK. But why are they so off?
>
> On Wed, May 26, 2010 at 2:21 AM, Jimmy O'Regan <[email protected]> wrote:
>> On 25 May 2010 22:41, haratron <[email protected]> wrote:
>>> I searched a lot and found this:
>>> tesseract image.tif boxes batch.nochop makebox
>>>
>>> If I invoke that, I get a boxes.txt file with what appear to be
>>> coordinates. But they are too large. I read somewhere that tesseract
>>> computes the coordinates from the bottom of the image and not from the
>>> top left corner (from the tests I did, this does not appear to be
>>> valid). There are also two instances of the same word (same
>>> combination of letters) appearing in boxes.txt, whereas the image
>>> contains only one instance.
>>>
>>> Can anybody please shed some light here?
>>
>> The boxfiles are for characters, not words. They're used for training
>> Tesseract.
>>
>>
>> --
>> <Leftmost> jimregan, that's because deep inside you, you are evil.
>> <Leftmost> Also not-so-deep inside you.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en.
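For anyone who hits the same confusion: as far as I can tell, each line of the box file describes one character as "symbol left bottom right top" (newer versions append a page number), with the origin at the bottom-left corner of the image. That would explain why the y values look off when compared against a top-left origin. A rough sketch of the conversion follows. The sample line and the image height of 100 are made up for illustration, and parse_box_line and to_top_left are my own helper names, not part of Tesseract.

```python
def parse_box_line(line):
    """Parse one box-file line: symbol, left, bottom, right, top[, page]."""
    parts = line.split()
    symbol = parts[0]
    # Any trailing page number is ignored here.
    left, bottom, right, top = (int(p) for p in parts[1:5])
    return symbol, left, bottom, right, top


def to_top_left(box, image_height):
    """Convert a bottom-left-origin box to (symbol, x, y, width, height)
    measured from the top-left corner of the image."""
    symbol, left, bottom, right, top = box
    return symbol, left, image_height - top, right - left, top - bottom


# Hypothetical sample: the character "T" in an image assumed to be 100 px tall.
sample = "T 10 80 30 95 0"
print(to_top_left(parse_box_line(sample), image_height=100))
```

The real image height has to come from the image itself; it is not stored in the box file.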

