Thanks for the suggestions, Adrian. I'll give those a try.

Ashwan
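For reference, the "resize to 300DPI" suggestion quoted below comes down to a simple scale calculation. The following is only a rough sketch under stated assumptions: the helper names `size_for_dpi` and `tesseract_args`, the 72 DPI source resolution, and the 200x60 crop size are hypothetical and do not come from the thread. The Tesseract arguments are copied from the original command in the quoted message, and the command is only assembled here, not executed.

```python
from typing import List, Tuple


def size_for_dpi(width: int, height: int, src_dpi: float,
                 target_dpi: float = 300.0) -> Tuple[int, int]:
    """Return the pixel size that rescales an image from src_dpi to target_dpi."""
    if src_dpi <= 0 or target_dpi <= 0:
        raise ValueError("DPI values must be positive")
    # Multiply before dividing to keep rounding error small.
    return (round(width * target_dpi / src_dpi),
            round(height * target_dpi / src_dpi))


def tesseract_args(image_path: str) -> List[str]:
    """Build the Tesseract 3.05 argument list quoted in the thread (not run here)."""
    return [
        "tesseract", image_path,
        "-c", "tessedit_char_whitelist=0123456789:.",
        "-c", "tessedit_write_images=1",
        "-psm", "7",
        "stdout",
    ]


if __name__ == "__main__":
    # Hypothetical 200x60 crop, assumed to be at 72 DPI (an assumption, not from the thread).
    print(size_for_dpi(200, 60, 72))
    print(" ".join(tesseract_args("myimage.jpg")))
```

The computed size would then be handed to whichever tool does the actual resampling (GIMP, KalikoImage, or OpenCV, as mentioned in the quoted message); whether upscaling alone recovers the time is something that would need to be tested on the real image.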
On Thursday, October 11, 2018 at 9:42:49 AM UTC-4, Testing Windows Screenshots wrote:
>
> Hi Ashwan,
>
> GIMP is your friend:
> https://stackoverflow.com/questions/9480013/image-processing-to-improve-tesseract-ocr-accuracy
>
> If you're programming, use the KalikoImage library to replicate the manual GIMP
> steps; that’s easy.
>
> I found greyscale didn’t help.
>
> YES: Long line removal (may not apply to you) (OpenCV)
> YES: Resize to 300 DPI
> YES: Apply filters
>
> Hope this helps,
> Adrian
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Ashwan Reddy
> *Sent:* 11 October 2018 14:23
> *To:* tesseract-ocr <[email protected]>
> *Subject:* [tesseract-ocr] Getting time from image
>
> Hi,
>
> I'm trying to extract "8:56" from this image, which is cropped from a
> portion of a basketball broadcast. This command returns "757" using
> Tesseract 3.05, which is not the result I'm hoping for:
>
> tesseract myimage.jpg -c tessedit_char_whitelist=0123456789:. -c tessedit_write_images=1 -psm 7 stdout
>
> I've attached the tessinput image, which shows that the pre-processing
> steps basically remove the time entirely. Cropping the image to fit just
> the text area is not an option for my purposes unfortunately. Any ideas on
> how I could improve the result otherwise?
>
> Thanks!
> Ashwan
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/1a88bcb0-202c-420b-be6b-6e0e7a84258f%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

