Sorry, I meant Rescaling: Tesseract works best on images with a resolution of at least 300 dpi, so it may be beneficial to resize images. For more information, see the FAQ. "Willus Dotkom" ran an interesting test of optimal image resolution, with a suggestion for the optimal height of a capital letter in pixels: https://groups.google.com/g/tesseract-ocr/c/Wdh_JJwnw94/m/24JHDYQbBQAJ
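The rescaling advice above could be sketched roughly as follows. This is only an illustration, not a recipe from the Tesseract documentation: the helper names `scale_factor` and `rescale_for_ocr` are made up for this example, and the pixel heights in the usage line are placeholder numbers, not values taken from the linked test. Measure the height of your own digits and take the target height from that test.

```python
def scale_factor(measured_cap_height_px: float, target_cap_height_px: float) -> float:
    """Return the resize factor that brings capital letters to the target height."""
    if measured_cap_height_px <= 0 or target_cap_height_px <= 0:
        raise ValueError("heights must be positive")
    return target_cap_height_px / measured_cap_height_px


def rescale_for_ocr(image, factor: float):
    """Resize an OpenCV image (a NumPy array) by `factor` along both axes."""
    import cv2  # imported here so scale_factor() can be used without OpenCV

    # INTER_CUBIC is a common choice when enlarging, INTER_AREA when shrinking.
    interpolation = cv2.INTER_CUBIC if factor > 1 else cv2.INTER_AREA
    return cv2.resize(image, None, fx=factor, fy=factor, interpolation=interpolation)


if __name__ == "__main__":
    # Placeholder numbers (assumption): digits measured at 12 px, aiming for 30 px.
    print(scale_factor(12, 30))  # 2.5
```

The resized image would then be thresholded and passed to Tesseract as before; whether a larger or smaller factor helps is something to check by experiment on your own images.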
After that, you can get output (but the dot is missing) with the command line:

    tesseract pH_treshr.png -

I was able to get the decimal point separator with the letsgodigital data file
https://github.com/arturaugusto/display_ocr/blob/master/letsgodigital/letsgodigital.traineddata

    tesseract pH_treshr.png - -l letsgodigital

Or have a look at SSD: https://github.com/Shreeshrii/tessdata_ssd

Zdenko

On Sat, 25 Jun 2022 at 12:17, Hervé <[email protected]> wrote:

> I am on Tesseract 5.
>
> "Inverting images
>
> While tesseract version 3.05 (and older) handle inverted image (dark
> background and light text) without problem, for 4.x version use dark text
> on light background."
>
> Isn't that the same as:
> (thresh, im_bw) = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
> im_bw = cv2.bitwise_not(im_bw)
>
> As for resizing, I take my picture in full HD. Will increasing the
> resolution allow Tesseract to OCR it better?
>
> Thanks
>
> On Saturday, 25 June 2022 at 11:25:50 UTC+2, zdenop wrote:
>
>> Why did you not try more relevant hints, like inverting and resizing?
>>
>> Zdenko
>>
>> On Sat, 25 Jun 2022 at 10:56, Hervé <[email protected]> wrote:
>>
>>> I tried a gray image and a black-and-white image, and I use
>>>
>>> custom_psm = r'--psm 7'
>>>
>>> I didn't try other parameters.
>>>
>>> On Saturday, 25 June 2022 at 10:32:14 UTC+2, zdenop wrote:
>>>
>>>> On Sat, 25 Jun 2022 at 8:15, Hervé <[email protected]> wrote:
>>>>
>>>>> Hi
>>>>> I just tried some, without real success.
>>>>
>>>> Please be specific: what did you try and what was the result?
>>>>
>>>>> Could I train on digits from pictures? Maybe this font is not well
>>>>> recognized.
>>>>
>>>> Any training is useless if the failure is at the image preprocessing
>>>> stage.
>>>>
>>>>> Thanks
>>>>>
>>>>> On Friday, 24 June 2022 at 17:12:44 UTC+2, zdenop wrote:
>>>>>
>>>>>> Did you try to implement the suggestions from the documentation?
>>>>>> https://github.com/tesseract-ocr/tessdoc/blob/main/ImproveQuality.md
>>>>>>
>>>>>> Zdenko
>>>>>>
>>>>>> On Fri, 24 Jun 2022 at 16:59, Hervé <[email protected]> wrote:
>>>>>>
>>>>>>> Hi, I need some help to make tesseract-OCR recognize digits. I can't
>>>>>>> manage to make this work with
>>>>>>>
>>>>>>> https://img.super-h.fr/images/2022/06/24/9a03414616bc4c6bd6e4bdb78e9d6783.jpg
>>>>>>>
>>>>>>> Here is my code:
>>>>>>>
>>>>>>> import cv2
>>>>>>> import pytesseract
>>>>>>>
>>>>>>> pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
>>>>>>>
>>>>>>> def process_image(img):
>>>>>>>     #cv2.imshow('Img',img)
>>>>>>>     #cv2.waitKey(0)
>>>>>>>
>>>>>>>     ### convert to grayscale
>>>>>>>     gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
>>>>>>>     #cv2.imshow('Img',gray)
>>>>>>>     #cv2.waitKey(0)
>>>>>>>
>>>>>>>     ### run OCR on the grayscale image
>>>>>>>     valeur = pytesseract.image_to_string(gray)
>>>>>>>     print(valeur)
>>>>>>>
>>>>>>>     ## convert to black and white
>>>>>>>     (thresh, im_bw) = cv2.threshold(gray, 128, 255,
>>>>>>>                                     cv2.THRESH_BINARY | cv2.THRESH_OTSU)
>>>>>>>     im_bw = cv2.bitwise_not(im_bw)
>>>>>>>     #cv2.imshow('Img',im_bw)
>>>>>>>     #cv2.waitKey(0)
>>>>>>>     # cv2.imwrite('ph.png',im_bw)
>>>>>>>     print(pytesseract.image_to_string(im_bw))
>>>>>>>
>>>>>>> ### open the image
>>>>>>> img = cv2.imread('ocr5.png')
>>>>>>> # cv2.imshow('Img',imgcoupee)
>>>>>>>
>>>>>>> ### crop
>>>>>>> imgcoupee = img[1056:1517,950:1862]
>>>>>>> #img = cv2.imwrite('ocrcoupee.png',imgcoupee)
>>>>>>> # cv2.imshow('Img',imgcoupee)
>>>>>>>
>>>>>>> ### cut out the region corresponding to the pH
>>>>>>> ph = img[516:625, 616:815]
>>>>>>>
>>>>>>> #cv2.imwrite('pH.jpg', image_pH)
>>>>>>>
>>>>>>> ### chlorine region
>>>>>>> cl = img[516:625, 882:1056]
>>>>>>>
>>>>>>> ### flow fault region
>>>>>>> #flow = img[1302:1398,1054:1400]
>>>>>>>
>>>>>>> ### process
>>>>>>> #process_image(imgcoupee)
>>>>>>> process_image(ph)
>>>>>>> process_image(cl)
>>>>>>> #process_image(flow)
>>>>>>>
>>>>>>> The digits seem to be clear enough, but it doesn't work. Could
>>>>>>> someone help me?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "tesseract-ocr" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/a05712a5-e6ed-411f-a072-e389ea7095efn%40googlegroups.com

