Really? And do you think your image matches those examples? E.g. the text is in straight lines, there is no noise (just the text), the DPI is OK, etc.?
You will never get good output from bad input.

Zdenko

On Fri, May 5, 2017 at 10:31 AM, anita josic <[email protected]> wrote:

> Hi
>
> I read it now, but I still don't know what I need to use. I have already
> read a lot, but I still don't know which part is missing. I am hoping for
> real feedback and help. As you can see, I am not getting anywhere trying
> things on my own.
>
> On Friday, May 5, 2017 at 09:23:58 UTC+2, zdenop wrote:
>>
>> Did you read
>> https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality?
>>
>> Zdenko
>>
>> On Fri, May 5, 2017 at 9:10 AM, anita josic <[email protected]> wrote:
>>
>>> <https://lh3.googleusercontent.com/-OmlROZ0oDU8/WQwkpyPuSiI/AAAAAAAAF0Y/K_vAR52DRMEfruiqxCObmEEk0HA1tuS3wCLcB/s1600/IMG_20170504_200627.jpg>
>>> Hello
>>>
>>> I am trying to extract text from a picture, but I always get an empty
>>> text.
>>> The picture used in the code for image_to_string('temp2.jpg') is added
>>> below.
>>> I tried to threshold with opencv, but there was just a slight difference
>>> from the picture added below.
>>>
>>> Is there a step missing? Is the picture format jpg wrong? Is it
>>> impossible because of white and black fields appearing as text on the
>>> picture ..?
>>>
>>> I am urgently searching for help and hoping for an answer soon.
>>>
>>> #!/usr/bin/env python
>>> import os
>>> import subprocess
>>> from picamera.array import PiRGBArray
>>> from time import *
>>> from picamera import PiCamera
>>> from datetime import datetime, timedelta
>>> import cv2
>>> try:
>>>     import Image
>>> except ImportError:
>>>     from PIL import Image, ImageEnhance, ImageFilter
>>> from pytesseract import *
>>>
>>> #EXTRACT TEXT
>>> print 'pytesser:'
>>> #img = Image.open('/home/pi/camera/IMAGE-2017-05-04_141433.png')
>>> img = Image.open('artikelbild-02.jpg')
>>> im = img.convert('RGBA')
>>> enhancer = ImageEnhance.Contrast(im)
>>> im = enhancer.enhance(3)
>>> im = im.convert('1')
>>> im.save('temp2.jpg')
>>>
>>> #use tesseract library to extract text from
>>> text = pytesseract.image_to_string(Image.open('temp2.jpg'))
>>>
>>> print "Text:"+text
>>>
>>> #what the text contains
>>> if "DHL" in text:
>>>     print 'DHL Lieferant'
>>> elif "Post" in text:
>>>     print 'Postbote'
>>> elif "GLS" in text:
>>>
>>> ....
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/e97baa76-1ee5-49af-b824-766ab2ec0b03%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
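
For anyone reading this thread later: the preprocessing in the quoted script can hurt the input rather than help it. With Pillow's defaults, convert('1') dithers the image, which adds speckle, and saving the result as JPEG adds lossy compression artifacts on top. Below is a minimal sketch of a cleaner pipeline, assuming Pillow and pytesseract are installed. The function names preprocess_for_ocr and ocr_file, the scale and threshold values, and the temp2.png file name are illustrative assumptions, not anything from the original script or a recommendation from the Tesseract project.

```python
from PIL import Image, ImageOps


def preprocess_for_ocr(img, scale=2, threshold=128):
    """Return a clean black-and-white copy of `img` for OCR.

    `scale` and `threshold` are illustrative defaults, not tuned values.
    """
    gray = img.convert("L")  # grayscale, drops colour and alpha
    if scale != 1:
        # Enlarge small text; LANCZOS keeps edges reasonably sharp.
        gray = gray.resize((gray.width * scale, gray.height * scale),
                           Image.LANCZOS)
    gray = ImageOps.autocontrast(gray)  # stretch contrast to the full range
    # Hard threshold: every pixel becomes pure black or pure white.
    # Unlike convert('1') with its defaults, this does not dither.
    return gray.point(lambda p: 255 if p >= threshold else 0)


def ocr_file(src_path, tmp_path="temp2.png"):
    """Preprocess `src_path`, save it losslessly as PNG, and run Tesseract."""
    import pytesseract  # imported here so preprocessing works without it

    clean = preprocess_for_ocr(Image.open(src_path))
    clean.save(tmp_path)  # PNG avoids JPEG compression artifacts
    return pytesseract.image_to_string(Image.open(tmp_path))
```

Usage would be something like text = ocr_file('artikelbild-02.jpg'). This only removes damage done by the preprocessing itself. If the picture still has non-text fields, skewed lines, or too low a resolution, the advice above still applies and the output will likely stay empty or wrong.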

