Use OpenCV SIFT (or another feature detector) to align the picture with your template.
http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_feature2d/py_feature_homography/py_feature_homography.html#feature-homography
https://docs.opencv.org/3.3.0/dc/dc3/tutorial_py_matcher.html

That will make everything much easier.

Bye

Lorenzo

2018-05-28 14:34 GMT+02:00 Daron Goode <[email protected]>:

> I have built a template that can cut this into 6 pieces.
>
> What I need to do with these is to put \n newline characters at the end of
> the lines, or be able to get the Y coordinates to see when those change and
> by how much. I am not able to find anything useful on how to accomplish
> this.
>
> Thanks,
> Daron
>
> <https://lh3.googleusercontent.com/-VMAhirxkL4o/Wwv24_q8DfI/AAAAAAAAAwU/h63d0VQ3EzEnUjq76CnW2hc0o79xQ88DwCLcBGAs/s1600/6slices.png>
>
> On Sunday, May 27, 2018 at 11:02:13 PM UTC-5, Daron Goode wrote:
>>
>> Hello,
>>
>> I am new to Tesseract and could use some guidance on how a versed person
>> would tackle this issue. I have a PHP website where I can get the data out
>> of a PDF without any issues, but the order of the data that I am pulling is
>> a mess. The issue is that the return is only one long string without any
>> return characters or other way to break it down into parts. I was going to
>> slice the PDF into several chunks and run each one through OCR at a time, but
>> I find that Tesseract has the power to do what I need it to do. Also, with
>> the 1000s of times the user will be uploading a new PDF, it might not line
>> up exactly the way I need it to.
>>
>> My end goal is to be able to update all these values to my database in
>> the order they are related. For the 4th generation that would be 31
>> different areas to scoop up the data I need. If these are in order with an
>> X coordinate I can always use that and work my Y values down.
>>
>> Even if all I had to work with is a \n character for each line I might be
>> able to make that work.
>>
>> On the 4th generation pedigree I tried to cut the entire 4th
>> generation out. If I go that route, that would only be 6 crops I need to
>> make on this (1 for the dog, two for each of those parents, and then each
>> generation). My users will have 3 or 4 generation pedigrees.
>>
>> Any advice would be greatly appreciated.
>> Thanks
>> Daron
>>
>> <https://lh3.googleusercontent.com/-EUDy1RXhwNI/WwtIj87cJpI/AAAAAAAAAv4/YxrTRX4IDUU6fx5GlJTweEhUff6OgXzCgCLcBGAs/s1600/test4.png>
>>
>> <https://lh3.googleusercontent.com/-Z4Jqh3ibhC0/WwtI760Pl_I/AAAAAAAAAwA/mgbcQyCfk5smwKyzzfhIaNutRCplfvlNACLcBGAs/s1600/test2.jpg>
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/715e1bb6-b7d2-4ce0-8a84-f583bdaf95ce%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

