Re: Bad Read?

Eugene Reimer Fri, 02 Jul 2010 13:04:21 -0700

The command-line tesseract on that image does produce two lines. Mind you, the first line consists entirely of gibberish. Here's what I get:


.>’¢:>¢:>C)_§?
522960


That's on Linux with tesseract version 2.04 with the "eng" language-files.
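
For anyone who wants to repeat that check from a script, here is a minimal sketch, not a definitive recipe. It assumes the `tesseract` binary is on the PATH and uses the classic command form `tesseract <image> <outputbase> -l <lang>`, which writes its text to `<outputbase>.txt`. The helper names and the file name `pg1-CROP.tif` are hypothetical; as far as I know, 2.04 builds generally expect TIFF input, so the JPEG would probably need converting first.

```python
import subprocess
from pathlib import Path


def build_command(image, out_base, lang="eng"):
    """Build the argument list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", str(image), str(out_base), "-l", lang]


def nonempty_lines(text):
    """Return the lines of OCR output that contain something other than whitespace."""
    return [line for line in text.splitlines() if line.strip()]


def ocr_lines(image, out_base="out", lang="eng"):
    """Run tesseract (assumed to be installed) and return its non-empty output lines."""
    subprocess.run(build_command(image, out_base, lang), check=True)
    # tesseract appends ".txt" to the output base name on its own.
    text = Path(f"{out_base}.txt").read_text(encoding="utf-8", errors="replace")
    return nonempty_lines(text)


if __name__ == "__main__":
    # Hypothetical file name: a TIFF copy of the cropped page image.
    lines = ocr_lines("pg1-CROP.tif")
    print(len(lines), "line(s) recognized")
    for line in lines:
        print(repr(line))
```

If this reports two lines, the line segmentation itself is working and the trouble is with recognition of the first line.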


Jimmy O'Regan wrote, On 2010-07-02 13:30:

Honestly, I've no idea -- that should have come out as two separate lines.
Would you mind opening an issue for this? I don't really have a lot of time at the moment, and won't for the next few weeks, but if there's an open issue I'll be more likely to come back to it.


KAH wrote, On 2010-07-02 10:49:

I am trying to figure out why tesseract is not reading this image as two lines. Is there a variable I can set that will let me tell the process to see the vertical space as space and not treat it all as one word?
Here is the image I am trying to read: http://dl.dropbox.com/u/1531272/pg1-CROP.jpg
Thanks for any help you can offer as I try to tweak this awesome product.


