I am trying to read a simple equation from an image. I have tried
preprocessing the image, but the result is still poor: sometimes the math
operator is not detected at all, and sometimes the + is read as a 4.
Can someone give me a tip? I have already tried a lot of things.
Here is my current code:
import cv2
import pytesseract
import numpy as np

img = cv2.imread("image.png")
# Upscale slightly before binarizing
img = cv2.resize(img, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_CUBIC)
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Otsu threshold (inverted), then invert back
img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
img = cv2.bitwise_not(img)
# Erode, then dilate, each with a 1x1 kernel
kernel = np.ones((1, 1), np.uint8)
img = cv2.erode(img, kernel, iterations=1)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 1))
img = cv2.dilate(img, kernel, iterations=1)
content = pytesseract.image_to_string(
    img, lang="eng+equ",
    config="--psm 13 -c tessedit_char_whitelist=0123456789+=")
print(content)  # prints 23431 instead of 23 + 31
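
To tell a usable result from a failed one while experimenting, the OCR output could be checked against the expected shape of the equation. This is only a rough sketch: it assumes every image contains two non-negative integers joined by a +, optionally followed by =, and the names EQUATION_RE and parse_equation are made up for illustration, not part of pytesseract.

```python
import re

# Assumed format: "<digits> + <digits>", optionally followed by "=".
# Whitespace around the tokens is tolerated.
EQUATION_RE = re.compile(r"^\s*(\d+)\s*\+\s*(\d+)\s*=?\s*$")


def parse_equation(text):
    """Return (left, right) as ints if text looks like 'a + b', else None."""
    match = EQUATION_RE.match(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_equation("23 + 31"))  # (23, 31)
print(parse_equation("23431"))    # None: the operator is missing
```

A None result flags outputs like 23431, where the + was dropped or read as a digit, so different preprocessing settings can be compared by how often they produce a parseable string.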
Here is the image: [image: captcha.png]
Thank you!