<https://lh4.googleusercontent.com/-NXtPPHdGSV0/UK9E4BcIewI/AAAAAAAAAAM/vAhsW5-Lw0c/s1600/EGG1.jpg>

Sure. The sample has been recreated to protect the original data, but the effect is the same. The top part of the image is an extract of the text with the underscores; the lower part is the same text with the underscores removed. Many thanks for your help with this.

The effect occurred with every font I use, but this test was done in Arial. (Does anyone know a solution to the capital I and lowercase L confusion?)

On Thursday, November 22, 2012 12:26:23 PM UTC, davebt wrote:
> 1. 3.02 suggests it improves the accuracy, but am I correct in assuming
> that because there is no VB wrapper in the 3.02 update it is rendered
> unusable in certain circumstances?
> 2. I am having an issue with the OCR reading a block of text where the
> inclusion of an underscore or high brackets confuses the reader and returns
> gibberish as it no longer understands where the top of the text is (making
> it give the next best alternative for what it thinks the character is). I
> thought it was a line spacing issue, but when I used double line spacing I
> still return the same results. As OCR can read languages with lines both
> above and below characters I cannot believe it is not a simple mistake I am
> making. Is it to do with the training? Any ideas?

