Hi Ashwan,

GIMP is your friend: https://stackoverflow.com/questions/9480013/image-processing-to-improve-tesseract-ocr-accuracy
If you're programming, use the KalikoImage library to replicate the manual GIMP steps; that's easy.

I found greyscale didn't help.

YES: Long line removal (may not apply to you) (OpenCV)
YES: Resize to 300 DPI
YES: Apply filters

Hope this helps,
Adrian

From: [email protected] [mailto:[email protected]] On Behalf Of Ashwan Reddy
Sent: 11 October 2018 14:23
To: tesseract-ocr <[email protected]>
Subject: [tesseract-ocr] Getting time from image

Hi,

I'm trying to extract "8:56" from this image, which is cropped from a portion of a basketball broadcast. This command returns "757" using Tesseract 3.05, which is not the result I'm hoping for:

tesseract myimage.jpg -c tessedit_char_whitelist=0123456789:. -c tessedit_write_images=1 -psm 7 stdout

I've attached the tessinput image, which shows that the pre-processing steps basically remove the time entirely. Cropping the image to fit just the text area is not an option for my purposes unfortunately.

Any ideas on how I could improve the result otherwise?

Thanks!
Ashwan

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/1a88bcb0-202c-420b-be6b-6e0e7a84258f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
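
The two steps recommended in the reply above (resize to 300 DPI, then apply filters) can be sketched without any imaging library. This is only a rough pure-Python illustration, not the actual GIMP or KalikoImage workflow. Everything in it is an assumption made for the example: the function names, the image represented as a 2D list of grey values from 0 to 255, nearest-neighbour scaling as the resize method, a fixed threshold of 128 as the "filter", and the invert option for text that is lighter than its background.

```python
def size_for_dpi(width, height, source_dpi, target_dpi=300):
    """Return the (width, height) an image needs in order to reach target_dpi."""
    if source_dpi <= 0:
        raise ValueError("source_dpi must be positive")
    return (round(width * target_dpi / source_dpi),
            round(height * target_dpi / source_dpi))


def upscale_nearest(pixels, factor):
    """Enlarge a 2D grid of grey values by an integer factor (nearest neighbour)."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    enlarged = []
    for row in pixels:
        # Repeat every pixel horizontally, then repeat the widened row vertically.
        wide = [value for value in row for _ in range(factor)]
        enlarged.extend(list(wide) for _ in range(factor))
    return enlarged


def binarize(pixels, threshold=128, invert=False):
    """Map every grey value to pure black (0) or pure white (255).

    With invert=True, bright pixels become black, which gives dark text
    on a light background when the source has light text on dark.
    """
    dark, light = (255, 0) if invert else (0, 255)
    return [[light if value >= threshold else dark for value in row]
            for row in pixels]


if __name__ == "__main__":
    # Hypothetical 2x3 crop: bright "text" pixels on a dark background.
    crop = [[20, 230, 25],
            [18, 240, 22]]
    cleaned = binarize(upscale_nearest(crop, 3), invert=True)
    for row in cleaned:
        print(row)
```

In practice the same steps would more likely be done with an imaging library before handing the file to tesseract; the sketch only shows the order of operations (enlarge first, then threshold).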

