As an alternative, you could try Tesseract 4, provided the image size is appropriate. I have used version 4 on raw images like these, and the results were unexpectedly good.
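For anyone scripting this, here is a minimal sketch of how the equivalent call might be assembled for version 4 from Python. The helper names `build_tesseract4_args` and `run_tesseract` are made up for illustration, and the file name is the one from the original command. The one change I am sure of is that version 4 spells the page segmentation flag `--psm` rather than `-psm`. As far as I know, the LSTM engine in 4.0 ignores `tessedit_char_whitelist`, so the sketch leaves the engine choice (`--oem`) optional; passing `oem=0` assumes your traineddata files include the legacy model.

```python
import shutil
import subprocess


def build_tesseract4_args(image_path, whitelist="0123456789:.", psm=7, oem=None):
    """Return the argv list for a Tesseract 4 run that prints text to stdout.

    Hypothetical helper, not part of Tesseract or any wrapper library.
    """
    # Tesseract 4 uses "--psm"; the single-dash "-psm" belongs to 3.x.
    args = ["tesseract", image_path, "stdout", "--psm", str(psm)]
    if oem is not None:
        # 0 = legacy engine, 1 = LSTM engine (assumption: pick what your data supports).
        args += ["--oem", str(oem)]
    if whitelist:
        # Restrict recognition to digits, colon and dot, as in the original command.
        args += ["-c", "tessedit_char_whitelist=" + whitelist]
    return args


def run_tesseract(image_path, **kwargs):
    """Run Tesseract and return the recognised text, stripped of whitespace."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract4_args(image_path, **kwargs),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Only prints the command line; call run_tesseract("myimage.jpg") to execute it.
    print(" ".join(build_tesseract4_args("myimage.jpg")))
```

Building the argument list separately from running it makes it easy to compare the 3.05 and 4 invocations before committing to either.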
On Thu, Oct 11, 2018 at 7:17 PM Ashwan Reddy <[email protected]> wrote:
> Thanks for the suggestions, Adrian. I'll give those a try.
>
> Ashwan
>
> On Thursday, October 11, 2018 at 9:42:49 AM UTC-4, Testing Windows Screenshots wrote:
>>
>> Hi Ashwan,
>>
>> Gimp is your friend:
>> https://stackoverflow.com/questions/9480013/image-processing-to-improve-tesseract-ocr-accuracy
>>
>> If you're programming, use the KalikoImage library to replicate the manual GIMP steps; that’s easy.
>>
>> I found greyscale didn’t help.
>> YES: Long line removal (may not apply to you) (OpenCV)
>> YES: resize to 300DPI
>> YES: Apply filters
>>
>> Hope this helps, Adrian
>>
>> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Ashwan Reddy
>> *Sent:* 11 October 2018 14:23
>> *To:* tesseract-ocr <[email protected]>
>> *Subject:* [tesseract-ocr] Getting time from image
>>
>> Hi,
>>
>> I'm trying to extract "8:56" from this image, which is cropped from a portion of a basketball broadcast. This command returns "757" using Tesseract 3.05, which is not the result I'm hoping for:
>>
>> tesseract myimage.jpg -c tessedit_char_whitelist=0123456789:. -c tessedit_write_images=1 -psm 7 stdout
>>
>> I've attached the tessinput image, which shows that the pre-processing steps basically remove the time entirely. Cropping the image to fit just the text area is not an option for my purposes unfortunately. Any ideas on how I could improve the result otherwise?
>>
>> Thanks!
>>
>> Ashwan
>>
>> --
>> You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/1a88bcb0-202c-420b-be6b-6e0e7a84258f%40googlegroups.com
>> For more options, visit https://groups.google.com/d/optout.

--
Regards,
Soumik Ranjan Dasgupta

