I have built a template that can cut this into 6 pieces. What I need now is either to put newline (\n) characters at the end of each line, or to get the Y coordinates so I can see when they change and by how much. I have not been able to find anything useful on how to accomplish this.
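For reference, one route that may fit this need is Tesseract's TSV output (for example `tesseract page.png out tsv`), which reports `left` and `top` pixel coordinates plus block, paragraph, and line numbers for every word. The sketch below is only an illustration under assumptions: it assumes the PDF page has already been converted to an image, the function name `lines_from_tsv` is hypothetical, and the sample rows are made up rather than taken from a real pedigree. It groups words into lines and returns each line with its X and Y position, so lines can be joined with `\n` or compared by Y.

```python
import csv
import io
from collections import defaultdict

def lines_from_tsv(tsv_text):
    """Group word rows of Tesseract TSV output into lines with coordinates."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    groups = defaultdict(list)
    for row in reader:
        text = (row.get("text") or "").strip()
        # Level 5 rows are individual words; skip structural rows and blanks.
        if row["level"] != "5" or not text:
            continue
        key = (int(row["page_num"]), int(row["block_num"]),
               int(row["par_num"]), int(row["line_num"]))
        groups[key].append((int(row["left"]), int(row["top"]), text))

    lines = []
    for words in groups.values():
        words.sort()  # left-to-right within the line
        lines.append({
            "x": words[0][0],                      # left edge of the first word
            "y": min(top for _, top, _ in words),  # top edge of the line
            "text": " ".join(t for _, _, t in words),
        })
    # Top-to-bottom, then left-to-right.
    lines.sort(key=lambda line: (line["y"], line["x"]))
    return lines

# Made-up sample rows in the TSV column layout (not real OCR output).
SAMPLE_TSV = "\n".join([
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num"
    "\tleft\ttop\twidth\theight\tconf\ttext",
    "5\t1\t1\t1\t1\t1\t40\t300\t60\t20\t95\tSire",
    "5\t1\t1\t1\t1\t2\t110\t301\t80\t20\t94\tName",
    "5\t1\t2\t1\t1\t1\t400\t120\t90\t20\t93\tGrandsire",
    "5\t1\t2\t1\t2\t1\t400\t480\t90\t20\t92\tGranddam",
])

result = lines_from_tsv(SAMPLE_TSV)
# One recognized line per output line, separated by newline characters.
print("\n".join(line["text"] for line in result))
```

Sorting by `x` first instead of `y` would walk the pedigree column by column, which may suit the generation layout better; whether Tesseract's own line grouping matches the pedigree boxes would need checking on real pages.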
Thanks,
Daron

<https://lh3.googleusercontent.com/-VMAhirxkL4o/Wwv24_q8DfI/AAAAAAAAAwU/h63d0VQ3EzEnUjq76CnW2hc0o79xQ88DwCLcBGAs/s1600/6slices.png>

On Sunday, May 27, 2018 at 11:02:13 PM UTC-5, Daron Goode wrote:
>
> Hello,
>
> I am new to Tesseract and could use some guidance on how an experienced person would tackle this issue. I have a PHP website where I can get the data out of a PDF without any issues, but the order of the data that I am pulling is a mess. The issue is that the return is only one long string without any return characters or other way to break it down into parts. I was going to slice the PDF into several chunks and run each one through OCR at a time, but I find that Tesseract has the power to do what I need it to do. Also, with users uploading new PDFs thousands of times, a page might not line up exactly the way I need it to.
>
> My end goal is to be able to update all these values to my database in the order they are related. For the 4th generation that would be 31 different areas to scoop up the data I need. If these are in order with an X coordinate, I can always use that and work my Y values down.
>
> Even if all I had to work with is a \n character for each line, I might be able to make that work.
>
> On the 4th generation pedigree I tried to cut the entire last (4th) generation out. If I go that route, that would only be 6 crops I need to make on this (one for the dog, two for the parents, and then one for each generation). My users will have 3 or 4 generation pedigrees.
>
> Any advice would be greatly appreciated.
> Thanks
> Daron
>
> <https://lh3.googleusercontent.com/-EUDy1RXhwNI/WwtIj87cJpI/AAAAAAAAAv4/YxrTRX4IDUU6fx5GlJTweEhUff6OgXzCgCLcBGAs/s1600/test4.png>
>
> <https://lh3.googleusercontent.com/-Z4Jqh3ibhC0/WwtI760Pl_I/AAAAAAAAAwA/mgbcQyCfk5smwKyzzfhIaNutRCplfvlNACLcBGAs/s1600/test2.jpg>

