See https://github.com/jsoma/tesseract-uzn
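
To give a rough idea of what such a file contains: as far as I know, a uzn file is plain text with one rectangle per line, written as `left top width height label` in pixels from the top-left corner, and Tesseract is reported to pick it up when it sits next to the image with the same base name and you run with `--psm 4` (please check this against the repository above). Below is a minimal Python sketch that writes one. The helper `write_uzn`, the file name, and the field positions are made up for illustration and are not part of Tesseract.

```python
from pathlib import Path


def write_uzn(image_path, zones):
    """Write a .uzn zone file next to image_path and return its path.

    zones maps a label to (left, top, width, height) in pixels, measured
    from the top-left corner of the image (assumed layout, see above).
    """
    uzn_path = Path(image_path).with_suffix(".uzn")
    lines = []
    for label, (left, top, width, height) in zones.items():
        # Keep each label to a single token: whitespace becomes "_".
        token = "_".join(str(label).split())
        lines.append(f"{left} {top} {width} {height} {token}")
    uzn_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return uzn_path


if __name__ == "__main__":
    # Hypothetical field positions for one known form type.
    template = {
        "applicant name": (120, 340, 600, 48),
        "date": (820, 340, 220, 48),
    }
    print(write_uzn("form_page1.png", template))
```

A stored template per form name could be a dictionary like `template` above, written out as a uzn file before each OCR run.
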
Basically, uzn files predefine zones on the page, and each of those zones is then recognized on its own.

Search the forum for past posts on this.

On Fri, Oct 18, 2019, 11:29 Rahul Dochak <[email protected]> wrote:

> Can you elaborate on the process, if that is not too much to ask?
> Rahul
>
> On Friday, October 18, 2019 at 11:16:54 AM UTC+5:30, shree wrote:
>>
>> You can try with uzn files. See https://jsoma.github.io/kull/#/
>>
>> On Fri, Oct 18, 2019 at 11:03 AM Rahul Dochak <[email protected]>
>> wrote:
>>
>>> Hi All,
>>>
>>> I have a task and I can see a way to approach it, but I do not know
>>> how to carry it out. What I am trying to do is this:
>>> I want to make a form recognizer and then extract text from the fields
>>> inside the forms. The forms are scanned PDFs, and I do not know the
>>> forms or the fields beforehand; I only know the form name.
>>> I want to scan the PDF, convert it to text, search for the form name,
>>> and check whether I have a predefined template for that form type. If
>>> not, I have to somehow get the locations of all the fields (as I do
>>> not have the required fields for that form type), make a template for
>>> future use with the same form type, and extract the field data to
>>> JSON. I could not find a way to make a template on the go for a new
>>> form type. Guidance in the right direction would be helpful.
>>>
>>> Thanks in advance.
>>> Rahul.
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/6edb4f1a-c44c-4f9c-b929-f3079b223eb6%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/6edb4f1a-c44c-4f9c-b929-f3079b223eb6%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .

