1. You got the result for the image you provided.
2. I suggest you try a different OEM (OCR engine mode).
3. I know that invoice digitizers use different parameters for parsing numbers.
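A minimal sketch of point 2, assuming pytesseract as in the original post and an `eng.traineddata` that includes the legacy model (which `--oem 0` needs). `build_config` and `ocr_amounts` are hypothetical helper names, not part of pytesseract, and whether the legacy engine keeps the decimal point under psm 11 is something to check on your own images:

```python
from typing import Optional


def build_config(psm: int, oem: Optional[int] = None, invert: bool = False) -> str:
    """Build a Tesseract config string like the one used in the thread."""
    parts = [f"-c tessedit_do_invert={int(invert)}", f"--psm {psm}"]
    if oem is not None:
        # --oem 0 selects the legacy engine; it needs legacy traineddata.
        parts.append(f"--oem {oem}")
    return " ".join(parts)


def ocr_amounts(image, psm: int = 11, oem: Optional[int] = 0) -> str:
    """Run OCR on a PIL image with the chosen segmentation and engine mode."""
    import pytesseract  # imported lazily so build_config works without it

    text = pytesseract.image_to_string(
        image, lang="eng", config=build_config(psm, oem)
    )
    # Collapse the blank lines that sparse-text mode tends to produce.
    return text.replace("\n\n", "\n")
```

With the crop from the original post this could be called as `ocr_amounts(image.crop((2000, 1570, 2500, 2000)), psm=11, oem=0)`, then again with `psm=6` to compare the two outputs side by side.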
Zdenko

On Wed, 21 Apr 2021 at 17:45, Kumar Rajwani <[email protected]> wrote:

> Hi Zdenop,
> As I said, I know psm 6 works better for numbers, but it is not able to
> get all the text in the image, whereas psm 11 does better. That is the
> reason I want to go with psm 11, but I am getting the wrong amounts; that
> is the only problem I am facing with psm 11. Can you tell me how I can
> achieve the same result as you with psm 11?
> Thanks
>
> On Wednesday, April 21, 2021 at 8:34:20 PM UTC+5:30 zdenop wrote:
>
>> Try to use better config parameters, e.g.:
>>
>> $ tesseract download.png - --psm 6 --oem 0
>> will produce:
>> $ 250,941.00
>> $ -75,282.00
>> $ 175,659.00
>> $ -15,072 00
>> $ 2,860.00
>> $ 0.00
>> $ 163,447.00
>>
>> The legacy engine could be better for numbers.
>>
>> Zdenko
>>
>>
>> On Wed, 21 Apr 2021 at 14:10, Kumar Rajwani <[email protected]> wrote:
>>
>>> Hey,
>>> I am using tesseract to identify amounts in my forms. You can look at
>>> the image below for a sample. I am getting the perfect amount with
>>> decimals in psm 6, but when I use psm 11 I am getting the following
>>> output. I have to use psm 11 as it identifies more text compared to
>>> psm 6 in my images.
>>> 250,941
>>> 00
>>> 00
>>> -75,282
>>> 175,659
>>> 00
>>> -15,072
>>> 00
>>> 2,860
>>> 00
>>> 00
>>> 163,447
>>> 00
>>> The code I am using:
>>> print(pytesseract.image_to_string(image.crop((2000, 1570, 2500, 2000)),
>>>                                   lang="eng",
>>>                                   config='-c tessedit_do_invert=0 --psm 11').replace("\n\n", "\n"))
>>>
>>> I want to ask if there are any changes I can make to get the decimal
>>> point with psm 11.

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAJbzG8yJn_s41YkO15gauTdVjJS%2BQJr9fVC7%3DNFfQM15q4V41Q%40mail.gmail.com.

