Sorry, I am a real beginner. When I run:

    tesseract pH_treshr.png -

I get:

    Empty page!!
    Empty page!!

How did you produce this image, and why can't I run tesseract on it the way you do? I am on buster with tesseract 5.1.

Is there a way to discuss this (Discord?)

Thanks for your patience and help.

On Saturday, 25 June 2022 at 14:34:06 UTC+2, zdenop wrote:

> Sorry - I mean Rescaling:
>
> Tesseract works best on images which have a DPI of at least 300 dpi, so it
> may be beneficial to resize images. For more information see the FAQ.
> "Willus Dotkom" made an interesting test for optimal image resolution, with
> a suggestion for the optimal height of a capital letter in pixels:
> https://groups.google.com/g/tesseract-ocr/c/Wdh_JJwnw94/m/24JHDYQbBQAJ
>
> After that, you can get output (but the dot is missing) with the command
> line: "tesseract pH_treshr.png -"
>
> I was able to get the decimal point separator with the letsgodigital data
> file
> https://github.com/arturaugusto/display_ocr/blob/master/letsgodigital/letsgodigital.traineddata
>
>     tesseract pH_treshr.png - -l letsgodigital
>
> Or have a look at SSD https://github.com/Shreeshrii/tessdata_ssd
>
> Zdenko
>
> On Sat, 25 Jun 2022 at 12:17, Hervé <[email protected]> wrote:
>
>> I am on tesseract 5.
>>
>> Inverting images
>>
>> While tesseract version 3.05 (and older) handle inverted image (dark
>> background and light text) without problem, for 4.x version use dark text
>> on light background.
>>
>> Isn't that the same as:
>>
>>     (thresh, im_bw) = cv2.threshold(gray, 128, 255,
>>                                     cv2.THRESH_BINARY | cv2.THRESH_OTSU)
>>     im_bw = cv2.bitwise_not(im_bw)
>>
>> As for resizing: I take my picture in full HD. Will increasing the
>> resolution allow tesseract to OCR it better?
>>
>> Thanks
>>
>> On Saturday, 25 June 2022 at 11:25:50 UTC+2, zdenop wrote:
>>
>>> Why did you not try the more relevant hints, like inverting and resizing?
>>>
>>> Zdenko
>>>
>>> On Sat, 25 Jun 2022 at 10:56, Hervé <[email protected]> wrote:
>>>
>>>> I tried a gray image and a black-and-white image, and I use
>>>>
>>>>     custom_psm = r'--psm 7'
>>>>
>>>> I didn't try any other parameters.
>>>>
>>>> On Saturday, 25 June 2022 at 10:32:14 UTC+2, zdenop wrote:
>>>>
>>>>> On Sat, 25 Jun 2022 at 8:15, Hervé <[email protected]> wrote:
>>>>>
>>>>>> Hi
>>>>>> I just tried some, without real success.
>>>>>
>>>>> Please be specific: what did you try and what was the result?
>>>>>
>>>>>> Could I train on digits from pictures? Maybe this font is not well
>>>>>> recognized.
>>>>>
>>>>> Any training is useless if the failure is at the image preprocessing
>>>>> stage.
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Friday, 24 June 2022 at 17:12:44 UTC+2, zdenop wrote:
>>>>>>
>>>>>>> Did you try to implement the suggestions from the documentation?
>>>>>>> https://github.com/tesseract-ocr/tessdoc/blob/main/ImproveQuality.md
>>>>>>>
>>>>>>> Zdenko
>>>>>>>
>>>>>>> On Fri, 24 Jun 2022 at 16:59, Hervé <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi, I need some help making tesseract-OCR recognize digits. I can't
>>>>>>>> get it to work with
>>>>>>>>
>>>>>>>> https://img.super-h.fr/images/2022/06/24/9a03414616bc4c6bd6e4bdb78e9d6783.jpg
>>>>>>>>
>>>>>>>> Here is my code:
>>>>>>>>
>>>>>>>>     import cv2
>>>>>>>>     import pytesseract
>>>>>>>>
>>>>>>>>     pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
>>>>>>>>
>>>>>>>>     def process_image(img):
>>>>>>>>         #cv2.imshow('Img',img)
>>>>>>>>         #cv2.waitKey(0)
>>>>>>>>
>>>>>>>>         ### convert to grayscale
>>>>>>>>         gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
>>>>>>>>         #cv2.imshow('Img',gray)
>>>>>>>>         #cv2.waitKey(0)
>>>>>>>>
>>>>>>>>         ### run OCR on the grayscale image
>>>>>>>>         valeur = pytesseract.image_to_string(gray)
>>>>>>>>         print(valeur)
>>>>>>>>
>>>>>>>>         ## convert to black and white, then invert
>>>>>>>>         (thresh, im_bw) = cv2.threshold(gray, 128, 255,
>>>>>>>>                                         cv2.THRESH_BINARY | cv2.THRESH_OTSU)
>>>>>>>>         im_bw = cv2.bitwise_not(im_bw)
>>>>>>>>         #cv2.imshow('Img',im_bw)
>>>>>>>>         #cv2.waitKey(0)
>>>>>>>>         # cv2.imwrite('ph.png',im_bw)
>>>>>>>>         print(pytesseract.image_to_string(im_bw))
>>>>>>>>
>>>>>>>>
>>>>>>>>     ### open the image
>>>>>>>>     img = cv2.imread('ocr5.png')
>>>>>>>>     # cv2.imshow('Img',imgcoupee)
>>>>>>>>
>>>>>>>>     ### crop
>>>>>>>>     imgcoupee = img[1056:1517,950:1862]
>>>>>>>>     #img = cv2.imwrite('ocrcoupee.png',imgcoupee)
>>>>>>>>     # cv2.imshow('Img',imgcoupee)
>>>>>>>>
>>>>>>>>     ### cut out the region corresponding to the pH
>>>>>>>>     ph = img[516:625, 616:815]
>>>>>>>>
>>>>>>>>     #cv2.imwrite('pH.jpg', image_pH)
>>>>>>>>
>>>>>>>>     ### chlorine region
>>>>>>>>     cl = img[516:625, 882:1056]
>>>>>>>>
>>>>>>>>     ### flow fault region
>>>>>>>>     #flow = img[1302:1398,1054:1400]
>>>>>>>>
>>>>>>>>     ### process
>>>>>>>>     #process_image(imgcoupee)
>>>>>>>>     process_image(ph)
>>>>>>>>     process_image(cl)
>>>>>>>>     #process_image(flow)
>>>>>>>>
>>>>>>>> The digits seem to be clear enough, but it doesn't work. Could
>>>>>>>> someone help me?
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/a05712a5-e6ed-411f-a072-e389ea7095efn%40googlegroups.com
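
For anyone reading this thread later, the two hints discussed here (rescale the crop, and make sure the text ends up dark on a light background) could be sketched roughly as below. This is only a minimal sketch under stated assumptions, not the steps zdenop actually used to produce pH_treshr.png, which the thread does not show. The helper names `scale_factor`, `needs_inversion`, `preprocess` and `ocr_single_line` are made up for illustration, the 32-pixel target height is a placeholder rather than a value from this thread (check the linked resolution test), and the digit height in pixels has to be measured on your own crop.

```python
def scale_factor(glyph_height_px, target_height_px=32.0):
    """Return the resize factor that brings the digits to the target height.

    target_height_px=32 is an assumed placeholder, not a value from the
    thread; use the height suggested by the linked resolution test.
    """
    if glyph_height_px <= 0:
        raise ValueError("glyph_height_px must be positive")
    return target_height_px / glyph_height_px


def needs_inversion(mean_intensity, midpoint=127.5):
    """Guess whether a binarized image is light text on a dark background.

    If most pixels are dark (mean below the midpoint), the background is
    probably dark, so the image should be inverted before OCR.
    """
    return mean_intensity < midpoint


def preprocess(gray, glyph_height_px):
    """Rescale, binarize with Otsu, and invert only if the background is dark."""
    import cv2  # imported lazily so the two helpers above work without OpenCV

    factor = scale_factor(glyph_height_px)
    resized = cv2.resize(gray, None, fx=factor, fy=factor,
                         interpolation=cv2.INTER_CUBIC)
    # With THRESH_OTSU the explicit threshold argument (0 here) is ignored.
    _, binary = cv2.threshold(resized, 0, 255,
                              cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    if needs_inversion(float(binary.mean())):
        binary = cv2.bitwise_not(binary)
    return binary


def ocr_single_line(gray, glyph_height_px, lang=None):
    """Run tesseract in single-line mode (--psm 7) on the preprocessed crop."""
    import pytesseract  # assumes pytesseract and tesseract are installed

    binary = preprocess(gray, glyph_height_px)
    return pytesseract.image_to_string(binary, lang=lang, config="--psm 7")
```

On the inversion question: an unconditional `cv2.bitwise_not` flips the image whatever its polarity, so it only helps when the thresholded crop had light digits on a dark background. The mean-intensity check is one cheap heuristic for deciding that. Passing `lang="letsgodigital"` to `ocr_single_line` assumes the letsgodigital.traineddata file linked in the thread has been placed in your tessdata directory.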

