Have you tried whitelisting the characters you're looking for? https://stackoverflow.com/questions/2363490/limit-characters-tesseract-is-looking-for
tesseract input_file output_file --oem 0 -c tessedit_char_whitelist=1234567890

On Sunday, 6 September 2020 at 22:14:34 UTC+2 [email protected] wrote:
> I'm trying to extract just the numbers '1' and '2', it works if I crop out
> just the digits and feed it through my code, but if I include the heading
> "WEEK", it doesn't detect the numbers.
> I tried all the page segmentation methods (0-10). Could someone please
> help asap!
> THE TEXT IM TRYING TO EXTRACT IS ATTACHED BELOW!
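
If you'd rather run the whitelist command at the top of this message from Python, here is a minimal sketch. It assumes the tesseract binary is on your PATH and that your traineddata file includes the legacy engine, which --oem 0 needs. The helper names build_tesseract_command and run_tesseract and the file names week.png / week_out are made up for illustration. Since you only need '1' and '2', the example narrows the whitelist to those two characters.

```python
import subprocess


def build_tesseract_command(image_path, output_base, whitelist="1234567890", oem=0):
    """Return the argument list for a whitelist-restricted Tesseract run.

    Mirrors: tesseract input_file output_file --oem 0 -c tessedit_char_whitelist=...
    """
    return [
        "tesseract",
        image_path,
        output_base,
        "--oem", str(oem),
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


def run_tesseract(image_path, output_base, whitelist="1234567890"):
    """Run Tesseract and return the recognized text (hypothetical helper)."""
    cmd = build_tesseract_command(image_path, output_base, whitelist)
    # check=True raises CalledProcessError if tesseract exits with an error.
    subprocess.run(cmd, check=True)
    # Tesseract appends ".txt" to the output base name.
    with open(output_base + ".txt", encoding="utf-8") as f:
        return f.read().strip()


if __name__ == "__main__":
    # Placeholder file names; only prints the command, does not run OCR.
    print(build_tesseract_command("week.png", "week_out", whitelist="12"))
```

Calling run_tesseract("week.png", "week_out", whitelist="12") would run the OCR and read the result back from week_out.txt. Whether the whitelist alone is enough to get past the "WEEK" heading is something you'd have to try on your image.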

