There doesn't seem to be one... I guess I can try StackOverflow.

On Tue, May 23, 2017 at 11:54 AM, Andreas Lehmkühler <[email protected]> wrote:
>
> Gilad Denneboom <[email protected]> wrote on May 22, 2017 at 22:07:
> >
> > Hi all,
> >
> > So I'm trying to identify hairlines in my PDFs. I came across tabula,
> > which seems to be able to do it, but I can't get it to quite work with
> > my files in the way I need it to, so any help is greatly appreciated!
> >
> > Here's what I've been doing so far: I used the Ruling object from
> > tabula to extract both the horizontal and vertical rules from a
> > stripped version of the PDF page (i.e., after removing all the text
> > in it).
> > I'm getting results, but now I want to relate them back to the
> > original PDF page, and that's proving difficult. If I add a text field
> > using the coordinates of the Ruling objects, it ends up far from where
> > I would expect it to be. I think it has to do with the DPI setting
> > used to convert the PDF page to an image, which is necessary for the
> > rulings extraction.
> > So my question is: How can I take these Ruling objects and convert
> > them back to the original coordinates of the PDF?
> > I would also like to be able to identify only lines of a certain width
> > and height, but if I get the rectangles to work correctly I think I
> > can do that in post-processing.
>
> Sounds like a question for the tabulapdf community ...
>
> Andreas
>
> >
> > Thanks in advance!
> > Gilad
>
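For reference, the coordinate conversion asked about in the quoted message can be sketched independently of tabula. This is only a rough illustration under stated assumptions: the rectangles are in pixels of an image rendered at a known DPI, with the origin at the top-left corner; the PDF page uses the default user space of 72 units per inch with the origin at the bottom-left; and the page is not rotated and its box starts at (0, 0). The names `Rect`, `image_rect_to_pdf`, and `is_hairline` are hypothetical and are not part of tabula or PDFBox. Whether tabula's Ruling coordinates are actually in pixels is not confirmed in this thread; if they are already in points, a scale of 1 (dpi=72) would leave only the vertical flip.

```python
from typing import NamedTuple

# Default PDF user space: 72 units ("points") per inch.
PDF_POINTS_PER_INCH = 72.0


class Rect(NamedTuple):
    """Hypothetical rectangle: (x, y) is one corner, plus width and height."""
    x: float
    y: float
    width: float
    height: float


def image_rect_to_pdf(rect: Rect, dpi: float, page_height_pt: float) -> Rect:
    """Map a top-left-origin pixel rectangle to a bottom-left-origin PDF rectangle.

    Assumes an unrotated page whose box starts at (0, 0).
    """
    scale = PDF_POINTS_PER_INCH / dpi      # pixels -> points
    width = rect.width * scale
    height = rect.height * scale
    x = rect.x * scale
    # Flip the vertical axis: image y grows downward, PDF y grows upward.
    # The result is the lower-left corner of the rectangle in PDF space.
    y = page_height_pt - rect.y * scale - height
    return Rect(x, y, width, height)


def is_hairline(rect: Rect, max_thickness_pt: float) -> bool:
    """Post-processing filter: keep rectangles whose thinner side is small enough."""
    return min(rect.width, rect.height) <= max_thickness_pt


# Example: a 288 x 2 pixel horizontal rule found on a 144 DPI rendering
# of a US Letter page (612 x 792 points).
ruling_px = Rect(x=144, y=288, width=288, height=2)
ruling_pt = image_rect_to_pdf(ruling_px, dpi=144, page_height_pt=792)
print(ruling_pt, is_hairline(ruling_pt, max_thickness_pt=1.5))
```

If the placed field is still offset after this, page rotation or a crop box that does not start at (0, 0) would be the next things to check.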

