Use OpenCV SIFT (or another feature detector) to align the picture with your template.

http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_feature2d/py_feature_homography/py_feature_homography.html#feature-homography

https://docs.opencv.org/3.3.0/dc/dc3/tutorial_py_matcher.html


Once every upload is aligned to the template, your crops will land in the same place each time, which will make everything much easier.


Bye

Lorenzo


2018-05-28 14:34 GMT+02:00 Daron Goode <[email protected]>:

> I have built a template that can cut this into 6 pieces.
>
> What I need to do with these is to put \n newline characters at the end of
> the lines, or be able to get the Y coordinates to see when those change and
> by how much. I have not been able to find anything useful on how to
> accomplish this.
>
> Thanks,
> Daron
>
>
> <https://lh3.googleusercontent.com/-VMAhirxkL4o/Wwv24_q8DfI/AAAAAAAAAwU/h63d0VQ3EzEnUjq76CnW2hc0o79xQ88DwCLcBGAs/s1600/6slices.png>
>
>
> On Sunday, May 27, 2018 at 11:02:13 PM UTC-5, Daron Goode wrote:
>>
>> Hello,
>>
>> I am new to Tesseract and could use some guidance on how an experienced person
>> would tackle this issue.  I have a PHP website where I can get the data out
>> of a PDF without any issues, but the order of the data that I am pulling is
>> a mess.  The issue is that the return is only one long string without any
>> return characters or other way to break it down into parts.  I was going to
>> slice the PDF into several chunks and run each one through OCR at a time, but
>> I find that Tesseract has the power to do what I need it to do. Also, with
>> the 1000s of times the user will be uploading a new PDF, it might not line
>> up exactly the way I need it to.
>>
>> My end goal is to be able to update all these values to my database in
>> the order they are related.  For the 4th generation that would be 31
>> different areas to scoop up the data I need.  If these are in order with an
>> X coordinate I can always use that and work my Y values down.
>>
>> Even if all I had to work with is a \n character for each line I might be
>> able to make that work.
>>
>> On the 4th generation Pedigree I tried to cut the entire 4th
>> generation out.  If I go that route that would only be 6 crops I need to
>> make on this (one for the dog, two for each of those parents, and then each
>> generation).  My users will have 3 or 4 generation pedigrees.
>>
>> Any advice would be greatly appreciated.
>> Thanks
>> Daron
>>
>>
>> <https://lh3.googleusercontent.com/-EUDy1RXhwNI/WwtIj87cJpI/AAAAAAAAAv4/YxrTRX4IDUU6fx5GlJTweEhUff6OgXzCgCLcBGAs/s1600/test4.png>
>>
>>
>> <https://lh3.googleusercontent.com/-Z4Jqh3ibhC0/WwtI760Pl_I/AAAAAAAAAwA/mgbcQyCfk5smwKyzzfhIaNutRCplfvlNACLcBGAs/s1600/test2.jpg>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/715e1bb6-b7d2-4ce0-8a84-f583bdaf95ce%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.
>
