You need to crop the text area:
[image: 6itc_cleanup_cropped.jpg]

tesseract 6itc_cleanup_cropped.jpg - --dpi 300
TH10-50%L
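For reference, a rough sketch of what the cropping step could look like in Python. It assumes dark text on a light background and that the text is the only dark content left in the image; the crop in this thread may well have been done by hand in an image editor. The helper names `crop_to_text` and `ocr_cropped`, the threshold of 128, and the 10-pixel margin are illustrative assumptions, not something taken from the thread.

```python
import numpy as np


def crop_to_text(gray, dark_threshold=128, margin=10):
    """Crop a grayscale image to the bounding box of its dark pixels.

    Assumes dark text on a light background. `dark_threshold` and `margin`
    are illustrative defaults and will need tuning per image.
    """
    mask = gray < dark_threshold               # True where a pixel counts as ink
    rows = np.flatnonzero(mask.any(axis=1))    # rows that contain ink
    cols = np.flatnonzero(mask.any(axis=0))    # columns that contain ink
    if rows.size == 0:
        return gray                            # nothing dark found, leave as is
    # Pad the box by `margin` pixels, clamped to the image borders.
    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + margin + 1, gray.shape[0])
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + margin + 1, gray.shape[1])
    return gray[top:bottom, left:right]


def ocr_cropped(path):
    """Read an image, crop it to the text, and run Tesseract on the crop."""
    # Imported lazily so crop_to_text can be used without OpenCV installed.
    import cv2
    import pytesseract

    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(path)
    cropped = crop_to_text(gray)
    # "--dpi 300" mirrors the option passed on the command line above.
    return pytesseract.image_to_string(cropped, config="--dpi 300")
```

Calling `ocr_cropped("6itc_cleanup_cropped.jpg")` would run the same kind of pipeline from a script. If the image still has a dark border or background noise, the bounding box will include it, so the input would need to be cleaned up first or the crop coordinates set manually.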
Zdenko

On Tue, 10 May 2022 at 14:12, boyapally srikanth <[email protected]> wrote:

> <https://stackoverflow.com/posts/72185956/timeline>
>
> I have been working on a project which involves extracting text from an
> image. I have read that Tesseract is one of the best libraries
> available, and I decided to use it along with OpenCV, which is needed
> for image manipulation.
>
> I have been experimenting a lot with the Tesseract engine, and it does
> not seem to be giving me the expected results. I have attached a sample
> image for reference. The output I got is:
>
> 1] =501 [
>
> Instead, the expected output is:
>
> TM10-50%L
>
> What I have done so far:
>
> - Remove noise
> - Adaptive threshold
> - Send it to the Tesseract OCR engine
>
> Are there any other suggestions to improve the algorithm?
>
> Thanks in advance.
>
> Snippet of the code:
>
> import cv2
> import sys
> import pytesseract
> import numpy as np
> from PIL import Image
>
> if __name__ == '__main__':
>     if len(sys.argv) < 2:
>         print('Usage: python ocr_simple.py image.jpg')
>         sys.exit(1)
>     # Read image path from command line
>     imPath = sys.argv[1]
>     # Load the image as grayscale
>     gray = cv2.imread(imPath, 0)
>     # Blur
>     blur = cv2.GaussianBlur(gray, (9, 9), 0)
>     # Binarizing
>     thresh = cv2.adaptiveThreshold(blur, 255,
>         cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 5, 3)
>     text = pytesseract.image_to_string(thresh)
>     print(text)
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/73c2c2e1-431b-4343-9bb8-091286065159n%40googlegroups.com.

