I have built a template that can cut this into 6 pieces. 

What I need to do with these is either put \n newline characters at the end 
of each line, or get the Y coordinates so I can see when they change and by 
how much. I have not been able to find anything useful on how to accomplish 
this. 
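
For reference, one possible way to get at both the line breaks and the Y coordinates is Tesseract's TSV output (for example `tesseract 6slices.png out tsv`), which has one row per word with `block_num`, `par_num`, `line_num`, `left` and `top` columns. Below is a minimal Python sketch, not a tested solution for these pedigrees: the function name `lines_from_tsv` and the sample rows are made up for illustration, and the same grouping could be done in PHP.

```python
import csv
import io
from collections import defaultdict


def lines_from_tsv(tsv_text):
    """Group Tesseract TSV word rows into lines of (top, left, text)."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    groups = defaultdict(list)
    for row in reader:
        text = (row.get("text") or "").strip()
        # Level 5 rows are single words; other levels describe pages, blocks, lines.
        if row["level"] != "5" or not text:
            continue
        key = (row["page_num"], row["block_num"], row["par_num"], row["line_num"])
        groups[key].append((int(row["left"]), int(row["top"]), text))

    lines = []
    for words in groups.values():
        words.sort()  # left to right within the line
        top = min(w[1] for w in words)  # Y coordinate of the line
        left = words[0][0]              # X coordinate of the first word
        lines.append((top, left, " ".join(w[2] for w in words)))
    return sorted(lines)  # top to bottom, then left to right


# Hypothetical sample rows in the TSV column layout (not real OCR output).
SAMPLE_TSV = "\n".join([
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num"
    "\tleft\ttop\twidth\theight\tconf\ttext",
    "1\t1\t0\t0\t0\t0\t0\t0\t800\t600\t-1\t",
    "5\t1\t1\t1\t1\t1\t20\t40\t60\t18\t95\tCH",
    "5\t1\t1\t1\t1\t2\t90\t41\t80\t18\t93\tExample",
    "5\t1\t1\t1\t2\t1\t20\t70\t70\t18\t91\tSire",
])

result = lines_from_tsv(SAMPLE_TSV)
print("\n".join(text for _top, _left, text in result))
```

Joining the grouped lines with \n gives the line breaks, and the `top` and `left` values show when the Y position changes and by how much.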

Thanks,
Daron

<https://lh3.googleusercontent.com/-VMAhirxkL4o/Wwv24_q8DfI/AAAAAAAAAwU/h63d0VQ3EzEnUjq76CnW2hc0o79xQ88DwCLcBGAs/s1600/6slices.png>


On Sunday, May 27, 2018 at 11:02:13 PM UTC-5, Daron Goode wrote:
>
> Hello,
>
> I am new to Tesseract and could use some guidance on how a versed person 
> would tackle this issue.  I have a PHP website where I can get the data out 
> of a PDF without any issues, but the order of the data that I am pulling is 
> a mess.  The issue is that the return is only one long string without any 
> return characters or other way to break it down into parts.  I was going to 
> slice the PDF into several chunks and run each one through OCR at a time, but 
> I find that Tesseract has the power to do what I need it to do.  Also, with 
> the 1000s of times users will be uploading a new PDF, it might not line 
> up exactly the way I need it to. 
>
> My end goal is to be able to update all these values to my database in the 
> order they are related.  For the 4th generation that would be 31 different 
> areas to scoop up the data I need.  If these are in order with an X 
> coordinate I can always use that and work my Y values down.  
>
> Even if all I had to work with were a \n character for each line, I might be 
> able to make that work.  
>
> On the 4th generation pedigree I tried to cut the entire last (4th) 
> generation out.  If I go that route, that would only be 6 crops I need to 
> make on this (1 for the dog, two for each of those parents, and then each 
> generation).  My users will have 3 or 4 generation pedigrees.  
>
> Any advice would be greatly appreciated. 
> Thanks
> Daron
>
>
> <https://lh3.googleusercontent.com/-EUDy1RXhwNI/WwtIj87cJpI/AAAAAAAAAv4/YxrTRX4IDUU6fx5GlJTweEhUff6OgXzCgCLcBGAs/s1600/test4.png>
>
>
> <https://lh3.googleusercontent.com/-Z4Jqh3ibhC0/WwtI760Pl_I/AAAAAAAAAwA/mgbcQyCfk5smwKyzzfhIaNutRCplfvlNACLcBGAs/s1600/test2.jpg>
>
>
