> Gilad Denneboom <[email protected]> wrote on 22 May 2017 at 22:07:
>
> Hi all,
>
> So I'm trying to identify hair-lines in my PDFs. I came across tabula,
> which seems to be able to do it, but I can't get it to quite work with my
> files in the way I need it to, so any help is greatly appreciated!
>
> Here's what I've been doing so far: I used the Ruling object from tabula to
> extract both the horizontal and vertical rules from a stripped version of
> the PDF page (i.e., after removing all the text in it).
> I'm getting results, but now I want to relate them back to the original PDF
> page, and that's proving difficult. If I add a text field using the
> coordinates of the Ruling objects, they are way off from where I would
> expect them to be. I think it has to do with the DPI setting used to
> convert the PDF page to an image, which is necessary for the rulings
> extraction.
> So my question is: How can I take these Ruling objects and convert them
> back to the original coordinates of the PDF?
> I would also like to be able to only identify lines of a certain width and
> height, but if I get the rectangles to work correctly I think I can do that
> in post-processing.

Sounds like a question for the tabulapdf community ...

Andreas

> Thanks in advance!
> Gilad
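If the offset really does come from the DPI used for rendering, the conversion back to PDF user space might look roughly like the sketch below. It is not tabula's API: the Rect type and both function names are made up for illustration. It assumes the rectangles are in pixels of the rendered image with the origin at the top-left, that the page is unrotated, and that its lower-left corner is at (0, 0). PDF user space uses 72 units per inch with the origin at the bottom-left, so the values are scaled by 72/dpi and the y axis is flipped. The second function is one possible way to do the width/height filtering in post-processing.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # default PDF user space: 72 units per inch


@dataclass
class Rect:
    """Hypothetical rectangle type: x, y is one corner, plus width and height."""
    x: float
    y: float
    width: float
    height: float


def image_rect_to_pdf(rect: Rect, dpi: float, page_height_pts: float) -> Rect:
    """Convert a pixel rectangle (top-left origin) to PDF points (bottom-left origin).

    Assumes an unrotated page whose lower-left corner is at (0, 0).
    """
    scale = POINTS_PER_INCH / dpi
    width = rect.width * scale
    height = rect.height * scale
    x = rect.x * scale
    # Flip the y axis: the top edge in image space becomes a distance from
    # the top of the page, and PDF wants the lower edge of the rectangle.
    y = page_height_pts - rect.y * scale - height
    return Rect(x, y, width, height)


def is_hairline(rect: Rect, max_thickness_pts: float, min_length_pts: float) -> bool:
    """Keep only thin, long rectangles (either horizontal or vertical)."""
    thickness = min(rect.width, rect.height)
    length = max(rect.width, rect.height)
    return thickness <= max_thickness_pts and length >= min_length_pts


# Example: a horizontal rule found on a 300 DPI rendering of a US Letter page
# (792 points high).
rule_px = Rect(x=300, y=150, width=600, height=3)
rule_pdf = image_rect_to_pdf(rule_px, dpi=300, page_height_pts=792)
```

If the coordinates turn out to be in points already (with only the y axis flipped), the same function can be called with dpi=72 so that the scale factor is 1. A rotated page or a crop box that does not start at (0, 0) would need an extra offset or rotation step that this sketch leaves out.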

