See https://github.com/jsoma/tesseract-uzn

Basically, a uzn file predefines zones on the page, and each of those zones
is then recognized separately.
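To make that concrete, here is a minimal sketch of writing a uzn file from Python. It rests on assumptions you should check against the repository linked above: each line of the file is taken to be "left top width height label" in pixels, the file is taken to share the image's base name with a .uzn extension, and Tesseract is taken to read it only in a page segmentation mode without column finding, such as --psm 4. The names Zone, format_uzn, uzn_path_for, write_uzn and tesseract_command are hypothetical helpers, and the coordinates are made up.

```python
from pathlib import Path
from typing import NamedTuple


class Zone(NamedTuple):
    """One rectangular zone on the page, in pixels (hypothetical helper type)."""
    left: int
    top: int
    width: int
    height: int
    label: str = "Text"


def format_uzn(zones):
    """Render zones as uzn text: one 'left top width height label' line per zone."""
    return "".join(
        f"{z.left} {z.top} {z.width} {z.height} {z.label}\n" for z in zones
    )


def uzn_path_for(image_path):
    """Assumed convention: same base name as the image, with a .uzn extension."""
    return Path(image_path).with_suffix(".uzn")


def write_uzn(image_path, zones):
    """Write the zone file next to the image and return its path."""
    path = uzn_path_for(image_path)
    path.write_text(format_uzn(zones), encoding="ascii")
    return path


def tesseract_command(image_path, output_base):
    """Build (but do not run) a tesseract call; --psm 4 is an assumption."""
    return ["tesseract", str(image_path), str(output_base), "--psm", "4"]


if __name__ == "__main__":
    # Made-up field locations for an imaginary form.
    zones = [Zone(100, 200, 300, 50, "Name"), Zone(100, 300, 300, 50, "Date")]
    print(format_uzn(zones), end="")
    print(" ".join(tesseract_command("form.png", "form_out")))
```

A per-form template could then just be a stored list of such zones, written out as a uzn file before each run.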

Search the forum for past posts about uzn files.

On Fri, Oct 18, 2019, 11:29 Rahul Dochak <[email protected]> wrote:

> Can you elaborate on the process, if that is not too much to ask?
> Rahul
>
> On Friday, October 18, 2019 at 11:16:54 AM UTC+5:30, shree wrote:
>>
>> You can try with uzn files. See https://jsoma.github.io/kull/#/
>>
>> On Fri, Oct 18, 2019 at 11:03 AM Rahul Dochak <[email protected]>
>> wrote:
>>
>>> Hi All,
>>>
>>>     I have a task and can see a general approach, but I do not know how
>>> to implement it. This is what I am trying to do:
>>> I want to build a form recognizer and then extract text from the fields
>>> inside the forms. The forms are scanned PDFs, and I do not know the forms
>>> or their fields beforehand; I only know the form name.
>>> I want to scan the PDF, convert it to text, search for the form name, and
>>> check whether I have a predefined template for that form type. If not, I
>>> have to somehow get the locations of all the fields, since I do not have
>>> the required fields for that form type, make a template for future use
>>> with the same form type, and extract the field data to JSON. I could not
>>> find a way to make a template on the fly for a new form type. Guidance in
>>> the right direction would be helpful.
>>>
>>> Thanks in advance.
>>> Rahul.
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/6edb4f1a-c44c-4f9c-b929-f3079b223eb6%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/6edb4f1a-c44c-4f9c-b929-f3079b223eb6%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>
>>
>> --
>>
>> ____________________________________________________________
>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>
>
