You need to crop the image to the text area first:
[image: 6itc_cleanup_cropped.jpg]
tesseract 6itc_cleanup_cropped.jpg - --dpi 300
TH10-50%L
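
The same crop-then-OCR step can be sketched in Python. This is only an illustration under stated assumptions: the helper names `clamp_box` and `ocr_cropped` and the crop rectangle are hypothetical (the rectangle has to be measured on your own image), and it assumes OpenCV and pytesseract are installed. The `--dpi 300` option is passed through to tesseract, mirroring the command above.

```python
def clamp_box(x, y, w, h, img_w, img_h):
    """Clip a crop rectangle (x, y, width, height) to the image bounds.

    Returns (x0, y0, x1, y1), ready to be used as slice limits.
    """
    x0 = max(0, min(x, img_w))
    y0 = max(0, min(y, img_h))
    x1 = max(x0, min(x + w, img_w))
    y1 = max(y0, min(y + h, img_h))
    return x0, y0, x1, y1


def ocr_cropped(path, box):
    """Crop the text area given by box = (x, y, w, h) and run tesseract on it."""
    # Imported here so that clamp_box stays usable without OpenCV/pytesseract.
    import cv2
    import pytesseract

    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(path)
    img_h, img_w = gray.shape
    x0, y0, x1, y1 = clamp_box(*box, img_w, img_h)
    cropped = gray[y0:y1, x0:x1]  # NumPy slicing: rows first, then columns
    # Extra command-line options go through the config string.
    return pytesseract.image_to_string(cropped, config="--dpi 300").strip()
```

A call would look like `ocr_cropped("image.jpg", (40, 60, 400, 120))`, where the four numbers are placeholders for the rectangle around the text.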

Zdenko


On Tue, May 10, 2022 at 14:12, boyapally srikanth <[email protected]> wrote:

> <https://stackoverflow.com/posts/72185956/timeline>
>
> I have been working on a project that involves extracting text from an
> image. From my research, Tesseract is one of the best libraries
> available, so I decided to use it together with OpenCV, which is
> needed for image manipulation.
>
> I have been experimenting a lot with the Tesseract engine, and it does not
> seem to give the results I expect. I have attached a sample image for
> reference. The output I got is:
>
> 1] =501 [
>
> Instead, the expected output is:
>
> TM10-50%L
>
> What I have done so far:
>
>    - Remove noise
>    - Adaptive threshold
>    - Sending it to the Tesseract OCR engine
>
> Are there any other suggestions to improve the algorithm?
>
> Thanks in advance.
>
> Snippet of the code:
> import cv2
> import sys
> import pytesseract
> import numpy as np
> from PIL import Image
>
> if __name__ == '__main__':
>     if len(sys.argv) < 2:
>         print('Usage: python ocr_simple.py image.jpg')
>         sys.exit(1)
>     # Read image path from command line
>     imPath = sys.argv[1]
>     # Load the image as grayscale (flag 0)
>     gray = cv2.imread(imPath, 0)
>     # Blur
>     blur = cv2.GaussianBlur(gray, (9, 9), 0)
>     # Binarizing
>     thresh = cv2.adaptiveThreshold(blur, 255,
>                                    cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
>                                    cv2.THRESH_BINARY, 5, 3)
>     text = pytesseract.image_to_string(thresh)
>     print(text)
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/73c2c2e1-431b-4343-9bb8-091286065159n%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/73c2c2e1-431b-4343-9bb8-091286065159n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
