Hey, can you please suggest how I can achieve better results?

On Friday, April 23, 2021 at 9:10:51 PM UTC+5:30 Kumar Rajwani wrote:
> Hey zdenop, can you please look at this command?
>
> tesseract /content/img.png out2 --psm 11 -c textord_min_linesize=3
>
> It is working for me. Please tell me: if I change this parameter, will I
> get better results, or will it mess up something else?
> Can you also suggest some parameters that would work for me?
> Thanks
>
> On Friday, April 23, 2021 at 1:46:44 PM UTC+5:30 Kumar Rajwani wrote:
>
>> Hi, can you please look at this image so we can get a clearer idea of
>> why I want to go with psm 11?
>> If you try this image with psm 6, it will miss the first line, the date
>> will be wrong, and the number .40 will be converted into AQ, but the
>> same image with psm 11 gives better results.
>> Can you suggest something? That would be great.
>>
>> On Friday, April 23, 2021 at 1:17:20 PM UTC+5:30 Kumar Rajwani wrote:
>>
>>> Can you tell me whether there is any way to make psm 11 recognize
>>> numbers well? That would be great.
>>>
>>> On Thursday, April 22, 2021 at 12:11:59 PM UTC+5:30 Kumar Rajwani wrote:
>>>
>>>> Hey zdenop, that was the portion of the full image which was not
>>>> detected properly by tesseract. The full image contains lots of
>>>> information, which is the reason I didn't share it. All the
>>>> information is important, so psm 11 is working great there. If I use
>>>> psm 6 it misses some lines, so I can't use that.
>>>> I have tried psm 11 with oem 0, 1, 2, 3 but none of them work as I
>>>> want.
>>>> For me the best choice is psm 11, but numbers are the issue. Can you
>>>> advise something on this?
>>>> Thanks
>>>>
>>>> On Wednesday, April 21, 2021 at 10:35:09 PM UTC+5:30 zdenop wrote:
>>>>
>>>>> 1. You got the result for the image you provided.
>>>>> 2. I suggest you use another oem.
>>>>> 3. I know that invoice digitizers use different parameters for
>>>>> parsing numbers.
>>>>>
>>>>> Zdenko
>>>>>
>>>>> On Wed, 21 Apr 2021 at 17:45, Kumar Rajwani <[email protected]> wrote:
>>>>>
>>>>>> Hi Zdenop, as I said, I know psm 6 works better on numbers, but it
>>>>>> is not able to get all the text in the image, where psm 11 does
>>>>>> better. That is the reason I want to go with psm 11, but I am
>>>>>> getting the wrong amounts; that is the only problem I am facing
>>>>>> with psm 11. So can you tell me how I can achieve the same result
>>>>>> as you with psm 11?
>>>>>> Thanks
>>>>>>
>>>>>> On Wednesday, April 21, 2021 at 8:34:20 PM UTC+5:30 zdenop wrote:
>>>>>>
>>>>>>> Try to use better config parameters, e.g.:
>>>>>>>
>>>>>>> $ tesseract download.png - --psm 6 --oem 0
>>>>>>> will produce:
>>>>>>> $ 250,941.00
>>>>>>> $ -75,282.00
>>>>>>> $ 175,659.00
>>>>>>> $ -15,072 00
>>>>>>> $ 2,860.00
>>>>>>> $ 0.00
>>>>>>> $ 163,447.00
>>>>>>>
>>>>>>> The legacy engine could be better for numbers.
>>>>>>>
>>>>>>> Zdenko
>>>>>>>
>>>>>>> On Wed, 21 Apr 2021 at 14:10, Kumar Rajwani <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hey,
>>>>>>>> I am using tesseract to identify amounts in my forms. You can
>>>>>>>> look at the image below for a sample. I am getting the perfect
>>>>>>>> amount with decimals in psm 6, but when I use psm 11 I get the
>>>>>>>> following output. I have to use psm 11 as it identifies more
>>>>>>>> text compared to psm 6 in my images.
>>>>>>>> 250,941
>>>>>>>> 00
>>>>>>>> 00
>>>>>>>> -75,282
>>>>>>>> 175,659
>>>>>>>> 00
>>>>>>>> -15,072
>>>>>>>> 00
>>>>>>>> 2,860
>>>>>>>> 00
>>>>>>>> 00
>>>>>>>> 163,447
>>>>>>>> 00
>>>>>>>> The code I am using:
>>>>>>>> print(pytesseract.image_to_string(image.crop((2000,1570,2500,2000)),
>>>>>>>>       lang="eng",
>>>>>>>>       config='-c tessedit_do_invert=0 --psm 11').replace("\n\n","\n"))
>>>>>>>>
>>>>>>>> I want to ask if there are any changes I can make to get the
>>>>>>>> decimal point with psm 11.
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>>>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/4d793afb-b554-4322-83ef-4ff94accc85en%40googlegroups.com.
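
One possible workaround for the split amounts in the psm 11 output quoted above, offered as a sketch rather than a confirmed fix: ask pytesseract for word boxes with image_to_data and rejoin the words that sit on the same visual row. The helper names group_words_into_rows and restore_decimal, the 10-pixel y_tolerance, and the rule that a trailing two-digit token is the cents part of an amount are all assumptions made for illustration; none of them come from the thread, and the tolerance would need tuning for a real scan.

```python
import re
from typing import Dict, List, Tuple


def group_words_into_rows(data: Dict[str, list], y_tolerance: int = 10) -> List[str]:
    """Join OCR words whose vertical centres are within y_tolerance pixels.

    `data` is expected to look like the dictionary returned by
    pytesseract.image_to_data(..., output_type=Output.DICT): parallel
    lists under the keys "text", "left", "top" and "height".
    """
    words: List[Tuple[float, int, str]] = []
    for text, left, top, height in zip(
        data["text"], data["left"], data["top"], data["height"]
    ):
        text = str(text).strip()
        if text:  # skip the empty entries emitted for blocks, paragraphs and lines
            words.append((top + height / 2.0, left, text))

    words.sort()  # top to bottom first, then left to right

    rows: List[List[Tuple[float, int, str]]] = []
    for centre, left, text in words:
        # Compare with the most recently added word of the current row.
        if rows and abs(centre - rows[-1][-1][0]) <= y_tolerance:
            rows[-1].append((centre, left, text))
        else:
            rows.append([(centre, left, text)])

    # Within each row, order the words by their left edge before joining.
    return [
        " ".join(text for _, _, text in sorted(row, key=lambda w: w[1]))
        for row in rows
    ]


# Assumed heuristic: "<integer part> <two digits>" is an amount whose
# decimal point was lost when the two parts were detected separately.
_SPLIT_AMOUNT = re.compile(r"^(-?\d[\d,]*)\s+(\d{2})$")


def restore_decimal(row: str) -> str:
    """Turn "250,941 00" into "250,941.00"; leave any other row unchanged."""
    match = _SPLIT_AMOUNT.match(row)
    return f"{match.group(1)}.{match.group(2)}" if match else row


# Usage with pytesseract (not executed here; needs Tesseract and the image):
#   import pytesseract
#   from pytesseract import Output
#   crop = image.crop((2000, 1570, 2500, 2000))
#   data = pytesseract.image_to_data(
#       crop, lang="eng", config="--psm 11", output_type=Output.DICT
#   )
#   for row in group_words_into_rows(data):
#       print(restore_decimal(row))
```

This keeps psm 11 for detection and only changes how the detected words are put back together, so it does not depend on any particular engine mode. It will not help if the two halves of an amount are not detected at all.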

