Can you tell me if there is any way to make psm 11 recognize numbers well? That would be great.
On Thursday, April 22, 2021 at 12:11:59 PM UTC+5:30 Kumar Rajwani wrote:
> Hey zdenop, that was the portion of the full image which was not detected
> properly by tesseract. The full image contains a lot of information, which
> is why I didn't share it. All of the information is important, so psm 11
> works great there. If I use psm 6 it misses some lines, so I can't use
> that.
> I have tried psm 11 with oem 0, 1, 2 and 3, but none of them works as I
> want. For me the best choice is psm 11, but the numbers are an issue. Can
> you advise something on this?
> Thanks
>
> On Wednesday, April 21, 2021 at 10:35:09 PM UTC+5:30 zdenop wrote:
>
>> 1. You got the result for the image you provided.
>> 2. I suggest you use another oem.
>> 3. I know that invoice digitizers use different parameters for
>>    parsing numbers.
>>
>> Zdenko
>>
>> Wed, Apr 21, 2021 at 17:45 Kumar Rajwani <[email protected]> wrote:
>>
>>> Hi Zdenop, as I said, I know psm 6 works better on numbers, but it is
>>> not able to get all the text in the image, where psm 11 does better.
>>> That is the reason I want to go with psm 11, but I am getting wrong
>>> amounts; that's the only problem I am facing with psm 11. So can you
>>> tell me how I can achieve the same result as you with psm 11?
>>> Thanks
>>>
>>> On Wednesday, April 21, 2021 at 8:34:20 PM UTC+5:30 zdenop wrote:
>>>
>>>> Try to use better config parameters, e.g.:
>>>>
>>>> $ tesseract download.png - --psm 6 --oem 0
>>>> will produce:
>>>> $ 250,941.00
>>>> $ -75,282.00
>>>> $ 175,659.00
>>>> $ -15,072 00
>>>> $ 2,860.00
>>>> $ 0.00
>>>> $ 163,447.00
>>>>
>>>> The legacy engine could be better for numbers.
>>>>
>>>> Zdenko
>>>>
>>>> Wed, Apr 21, 2021 at 14:10 Kumar Rajwani <[email protected]> wrote:
>>>>
>>>>> Hey,
>>>>> I am using tesseract to identify amounts in my forms. You can look
>>>>> at the image below for a sample. I am getting perfect amounts with
>>>>> decimals in psm 6, but when I use psm 11 I am getting the following
>>>>> output. I have to use psm 11 as it identifies more text compared
>>>>> to psm 6 in my images.
>>>>> 250,941
>>>>> 00
>>>>> 00
>>>>> -75,282
>>>>> 175,659
>>>>> 00
>>>>> -15,072
>>>>> 00
>>>>> 2,860
>>>>> 00
>>>>> 00
>>>>> 163,447
>>>>> 00
>>>>> The code I am using:
>>>>> print(pytesseract.image_to_string(image.crop((2000,1570,2500,2000)),
>>>>>       lang="eng",
>>>>>       config='-c tessedit_do_invert=0 --psm 11').replace("\n\n","\n"))
>>>>>
>>>>> I want to ask if there are any changes I can make to get the decimal
>>>>> point with psm 11.

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/48e01af3-79c1-497b-b70e-6bfe557f9b63n%40googlegroups.com.
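One thing that might be worth trying, offered as a rough sketch and not a confirmed fix: psm 11 is sparse-text mode, which returns text in no particular order, so the "00" fragments in the output above may simply be separated from their amounts. Instead of reading the plain string, you could ask pytesseract for word boxes with image_to_data and regroup the words by their vertical position. The helper names (group_words_into_rows, join_amount, ocr_rows) and the y_tolerance value are made up for illustration, and the sketch assumes that each decimal fragment sits on the same visual row as its integer part and that only the decimal point was lost.

```python
import re


def group_words_into_rows(data, y_tolerance=10):
    """Group OCR words whose vertical centres lie within y_tolerance pixels.

    `data` is a dict shaped like pytesseract's Output.DICT result, with
    parallel lists under "text", "left", "top" and "height".
    y_tolerance is a guess; tune it to your image resolution.
    """
    words = []
    for text, left, top, height in zip(
        data["text"], data["left"], data["top"], data["height"]
    ):
        text = text.strip()
        if text:  # skip the empty entries tesseract emits for blocks/lines
            words.append((top + height / 2, left, text))
    words.sort()  # by vertical centre first, then by left edge

    rows = []
    for y, left, text in words:
        if rows and abs(y - rows[-1]["y"]) <= y_tolerance:
            rows[-1]["words"].append((left, text))
        else:
            rows.append({"y": y, "words": [(left, text)]})
    # Within each row, order the words left to right.
    return [[t for _, t in sorted(row["words"])] for row in rows]


def join_amount(tokens):
    """Rejoin ['250,941', '00'] as '250,941.00'; other rows are space-joined.

    Assumption: a row made of an integer part followed by exactly two
    digits is an amount whose decimal point was dropped.
    """
    if (
        len(tokens) == 2
        and re.fullmatch(r"-?[\d,]+", tokens[0])
        and re.fullmatch(r"\d{2}", tokens[1])
    ):
        return f"{tokens[0]}.{tokens[1]}"
    return " ".join(tokens)


def ocr_rows(image, config="-c tessedit_do_invert=0 --psm 11"):
    """Run tesseract in psm 11 and return one reconstructed string per row."""
    import pytesseract  # imported here so the helpers above work without it

    data = pytesseract.image_to_data(
        image, lang="eng", config=config, output_type=pytesseract.Output.DICT
    )
    return [join_amount(row) for row in group_words_into_rows(data)]
```

With the crop from the original post, usage would be something like print("\n".join(ocr_rows(image.crop((2000, 1570, 2500, 2000))))). If a fragment is misread, or the rows are closer together than y_tolerance, the grouping will be wrong, so check the result against the psm 6 output on a few samples first.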

