Sorry, I am really a noob.

When I run: tesseract pH_treshr.png -
I get:
Empty page!!
Empty page!!

How did you produce this image, and why can't I run tesseract on it the way 
you did? I am on buster with tesseract 5.1.

Is there a way to discuss this, for example on Discord?

Thanks for your patience and help.

On Saturday, 25 June 2022 at 14:34:06 UTC+2, zdenop wrote:

> Sorry, I meant rescaling:
>
> Tesseract works best on images with a resolution of at least 300 dpi, so it 
> may be beneficial to resize images. For more information see the FAQ.
> "Willus Dotkom" made an interesting test of optimal image resolution, with a 
> suggestion for the optimal height of a capital letter in pixels:
> https://groups.google.com/g/tesseract-ocr/c/Wdh_JJwnw94/m/24JHDYQbBQAJ
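
As a rough illustration of the rescaling advice above, here is a minimal sketch that works out how much to enlarge a crop so that the capital letters (or digits) reach a chosen pixel height. The function names, the 30 px target and the 12 px measured height are assumptions for illustration only, not values taken from the linked test; take the target from that discussion and measure your own image.

```python
import math


def upscale_factor(cap_height_px, target_cap_height_px=30.0):
    """Factor by which to resize so capital letters reach the target height.

    target_cap_height_px defaults to 30.0, which is only a placeholder
    (assumption); it is not a value quoted from the linked test.
    """
    if cap_height_px <= 0:
        raise ValueError("cap_height_px must be positive")
    return target_cap_height_px / cap_height_px


def scaled_size(width, height, factor):
    """New (width, height) in whole pixels after applying the factor."""
    return (math.ceil(width * factor), math.ceil(height * factor))


# 199 x 109 is the size of the pH crop img[516:625, 616:815] in the script
# quoted further down; the 12 px digit height is a made-up measurement.
factor = upscale_factor(12, 30)
new_width, new_height = scaled_size(199, 109, factor)
print(factor, new_width, new_height)
```

The resulting size could then be passed to cv2.resize, which takes the target size as (width, height), for example with interpolation=cv2.INTER_CUBIC.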
>
>
> After that, you can get output (but the dot is missing) with the command 
> line: "tesseract pH_treshr.png -"
>
> I was able to get the decimal point separator with the letsgodigital data 
> file 
> https://github.com/arturaugusto/display_ocr/blob/master/letsgodigital/letsgodigital.traineddata
> tesseract pH_treshr.png - -l letsgodigital
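
To make it easier to try these options from Python, here is a small sketch that assembles the same command line as a list of arguments. The helper name tesseract_command is hypothetical; the flags themselves (-l, --psm, --tessdata-dir) are standard tesseract options. Note that -l letsgodigital only works if letsgodigital.traineddata is in the tessdata directory that tesseract searches, or in the one given with --tessdata-dir.

```python
def tesseract_command(image, lang=None, psm=None, tessdata_dir=None):
    """Build a tesseract argument list (hypothetical helper, for illustration).

    The output base "-" makes tesseract write the recognized text to stdout,
    as in the commands quoted in this thread.
    """
    cmd = ["tesseract", image, "-"]
    if tessdata_dir is not None:
        cmd += ["--tessdata-dir", tessdata_dir]
    if lang is not None:
        cmd += ["-l", lang]
    if psm is not None:
        cmd += ["--psm", str(psm)]
    return cmd


# Same command as above, plus the single-line page segmentation mode (7)
# mentioned elsewhere in the thread.
cmd = tesseract_command("pH_treshr.png", lang="letsgodigital", psm=7)
print(" ".join(cmd))
```

With tesseract installed, the list can be run with subprocess.run(cmd, capture_output=True, text=True) and the text read from the result's stdout attribute.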
>
> Or have a look at SSD: https://github.com/Shreeshrii/tessdata_ssd
>
> Zdenko
>
>
> On Sat, 25 Jun 2022 at 12:17, Hervé <[email protected]> wrote:
>
>> I am on tesseract 5
>>
>> Inverting images 
>>
>> While tesseract version 3.05 (and older) handle inverted image (dark 
>> background and light text) without problem, for 4.x version use dark text 
>> on light background.
>> Isn't that the same as:
>>     (thresh, im_bw) = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
>>     im_bw = cv2.bitwise_not(im_bw)
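
To see what those two lines do to each pixel, here is a pure-Python sketch of the same idea on a tiny grayscale grid. It uses plain lists rather than OpenCV, and the helper names are made up for illustration. A fixed threshold of 128 is used here; with cv2.THRESH_OTSU, OpenCV chooses the threshold itself and ignores the 128.

```python
def threshold_binary(gray, thresh=128, maxval=255):
    """Pixels brighter than thresh become maxval, all others become 0."""
    return [[maxval if p > thresh else 0 for p in row] for row in gray]


def invert(img, maxval=255):
    """Flip dark and light, like a bitwise NOT on an 8-bit binary image."""
    return [[maxval - p for p in row] for row in img]


# A tiny "image": a bright vertical stroke (text) on a dark background.
light_on_dark = [
    [10, 200, 10],
    [10, 220, 10],
]

bw = threshold_binary(light_on_dark)   # stroke is white, background black
dark_on_light = invert(bw)             # stroke is black, background white
print(bw)
print(dark_on_light)
```

So thresholding followed by cv2.bitwise_not does turn light text on a dark background into dark text on a light background. If the source already has dark text on a light background, the bitwise_not step would flip it the wrong way.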
>>
>> As for resizing: I take my picture in full HD. Would increasing the 
>> resolution help tesseract recognize the text better?
>>
>> thanks
>>
>>
>> On Saturday, 25 June 2022 at 11:25:50 UTC+2, zdenop wrote:
>>
>>> Why did you not try the more relevant hints, like inverting and resizing?
>>>
>>> Zdenko
>>>
>>>
>>> On Sat, 25 Jun 2022 at 10:56, Hervé <[email protected]> wrote:
>>>
>>>> I tried a grayscale image and a black-and-white image, and I use
>>>>
>>>>  custom_psm = r'--psm 7'
>>>>
>>>> I didn't try any other parameters.
>>>> On Saturday, 25 June 2022 at 10:32:14 UTC+2, zdenop wrote:
>>>>
>>>>>
>>>>>
>>>>> On Sat, 25 Jun 2022 at 8:15, Hervé <[email protected]> wrote:
>>>>>
>>>>>> Hi
>>>>>> I just tried some of them, without real success.
>>>>>>
>>>>> Please be specific: what did you try and what was the result?
>>>>>
>>>>>  
>>>>>
>>>>>> Could I train on digits from pictures? Maybe this font is not well 
>>>>>> recognized.
>>>>>>
>>>>>
>>>>> Any training is useless if the failure is at the image preprocessing 
>>>>> stage.
>>>>>
>>>>>
>>>>>> thanks
>>>>>>
>>>>>> On Friday, 24 June 2022 at 17:12:44 UTC+2, zdenop wrote:
>>>>>>
>>>>>>> Did you try to implement the suggestions from the documentation?
>>>>>>> https://github.com/tesseract-ocr/tessdoc/blob/main/ImproveQuality.md
>>>>>>>
>>>>>>>
>>>>>>> Zdenko
>>>>>>>
>>>>>>>
>>>>>>> On Fri, 24 Jun 2022 at 16:59, Hervé <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi, I need some help getting tesseract-OCR to recognize digits. I 
>>>>>>>> can't get it to work with
>>>>>>>>
>>>>>>>>  
>>>>>>>> https://img.super-h.fr/images/2022/06/24/9a03414616bc4c6bd6e4bdb78e9d6783.jpg
>>>>>>>>  
>>>>>>>>
>>>>>>>> Here is my code:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> import cv2
>>>>>>>> import pytesseract
>>>>>>>>
>>>>>>>> pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
>>>>>>>>
>>>>>>>> def process_image(img):
>>>>>>>>     #cv2.imshow('Img',img)
>>>>>>>>     #cv2.waitKey(0)
>>>>>>>>
>>>>>>>>     ### convert to grayscale
>>>>>>>>     gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
>>>>>>>>     #cv2.imshow('Img',gray)
>>>>>>>>     #cv2.waitKey(0)
>>>>>>>>
>>>>>>>>     ### run OCR on the grayscale image
>>>>>>>>     valeur = pytesseract.image_to_string(gray)
>>>>>>>>     print(valeur)
>>>>>>>>
>>>>>>>>     ## convert to black and white (Otsu threshold), then invert
>>>>>>>>     (thresh, im_bw) = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
>>>>>>>>     im_bw = cv2.bitwise_not(im_bw)
>>>>>>>>     #cv2.imshow('Img',im_bw)
>>>>>>>>     #cv2.waitKey(0)
>>>>>>>>     # cv2.imwrite('ph.png',im_bw)
>>>>>>>>     print(pytesseract.image_to_string(im_bw))
>>>>>>>>
>>>>>>>>
>>>>>>>> ### open the image
>>>>>>>> img = cv2.imread('ocr5.png')
>>>>>>>> # cv2.imshow('Img',imgcoupee)
>>>>>>>>
>>>>>>>>
>>>>>>>> ### crop
>>>>>>>> imgcoupee = img[1056:1517,950:1862]
>>>>>>>> #img = cv2.imwrite('ocrcoupee.png',imgcoupee)
>>>>>>>> # cv2.imshow('Img',imgcoupee)
>>>>>>>>
>>>>>>>> ### crop the region containing the pH reading
>>>>>>>> ph= img[516:625, 616:815]
>>>>>>>>
>>>>>>>> #cv2.imwrite('pH.jpg', image_pH)
>>>>>>>>
>>>>>>>> ### chlorine region
>>>>>>>> cl = img[516:625, 882:1056]
>>>>>>>>
>>>>>>>> ### flow fault ("défaut flow") region
>>>>>>>> #flow= img[1302:1398,1054:1400]
>>>>>>>>
>>>>>>>> ### process
>>>>>>>> #process_image(imgcoupee)
>>>>>>>> process_image(ph)
>>>>>>>> process_image(cl)
>>>>>>>> #process_image(flow)
>>>>>>>>
>>>>>>>> The digits seem clear enough, but it doesn't work. Could someone 
>>>>>>>> help me?
>>>>>>>>
>>>>>>>> thanks !
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to [email protected].
>>>>>>>> To view this discussion on the web visit 
>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/a05712a5-e6ed-411f-a072-e389ea7095efn%40googlegroups.com
>>>>>>>>  
>>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/a05712a5-e6ed-411f-a072-e389ea7095efn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>>

