Thank you so much for your help! Inverting the images did wonders. I also resized them further and applied some filtering to reduce a little of the pixelation. Since I'm using some Python bindings instead of the command line, I didn't have immediate access to the other options you used. I think I'll add those to the bindings and submit a pull request. Thanks again!
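For anyone who finds this thread later, here is a minimal sketch of the preprocessing described above (invert, upscale, light filtering), followed by the command line from Dmitri's reply. It assumes Pillow is installed and that the `tesseract` executable is on the PATH. The function names, the scale factor, and the choice of a median filter are illustrative assumptions, not the exact steps or values used here.

```python
from PIL import Image, ImageFilter, ImageOps


def preprocess_for_ocr(img, scale=3):
    """Return an inverted, upscaled, lightly smoothed grayscale copy of img.

    The scale factor and the median filter are assumptions for illustration;
    tune them for your own images.
    """
    gray = img.convert("L")            # work in 8-bit grayscale
    inverted = ImageOps.invert(gray)   # light-on-dark text becomes dark-on-light
    width, height = inverted.size
    enlarged = inverted.resize((width * scale, height * scale), Image.LANCZOS)
    # A small median filter softens some of the blockiness left by upscaling.
    return enlarged.filter(ImageFilter.MedianFilter(size=3))


def build_tesseract_command(image_path, output_base, psm=7, psm_flag="-psm"):
    """Build the argument list for the command line quoted in the thread.

    "-psm" matches the 2015 build mentioned in the thread; Tesseract 4 and
    later spell the option "--psm", so pass psm_flag="--psm" there.
    """
    return ["tesseract", image_path, output_base, psm_flag, str(psm)]
```

To use it, open the source image with `Image.open`, save the result of `preprocess_for_ocr` (for example as `debug_i.png`), and pass the list from `build_tesseract_command` to `subprocess.run`. Page segmentation mode 7 treats the image as a single text line, which is the mode suggested in the reply.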
Cheers,
Sean

On Thursday, November 19, 2015 at 10:44:52 AM UTC-8, Dmitri Silaev wrote:
>
> For "debug.png", I'd suggest first inverting the image, then running
> Tesseract in the single text line segmentation mode (7), or modes 8/10.
>
> For "debug2.png", running Tesseract with the "-psm 7" option is enough but
> I advise to invert all such images because Tess often may confuse
> foreground and background pixels - usually foreground is black.
>
> Example command line: tesseract debug_i.png debug_i.png -psm 7
>
> Tested with Tess executable built as of 20150203.
>
> Best regards,
> Dmitri Silaev
> www.CustomOCR.com
>
> On Thu, Nov 19, 2015 at 8:04 PM, Sean Leffler <[email protected]> wrote:
>
>> Hello! I'm new to Tesseract and I'm trying to use it to read text which
>> will always be similar to these three images (always in the same font and
>> with similar, relatively noise-free backgrounds.) I'm aware the images are
>> very small; however, what puzzles me is that Tesseract seems to be
>> perfectly fine with the image I've attached under the name "working.png".
>> For the other two, it fails to detect any text. Is there anything I could
>> do to improve this? I've tried scaling up the images and that didn't seem
>> to do anything. Thanks in advance!
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/5fe49e71-5f55-4fb2-b1af-3097ceee4bc7%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.

