Hi all, I'm trying to identify hairlines in my PDFs. I came across tabula, which seems able to do this, but I can't quite get it to work with my files the way I need, so any help is greatly appreciated!
Here's what I've been doing so far: I used the Ruling object from tabula to extract both the horizontal and vertical rules from a stripped version of the PDF page (i.e., after removing all the text from it). I'm getting results, but now I want to relate them back to the original PDF page, and that's proving difficult. If I add a text field using the coordinates of the Ruling objects, it ends up way off from where I would expect it to be. I think this has to do with the DPI setting used to convert the PDF page to an image, which is necessary for the rulings extraction.

So my question is: how can I take these Ruling objects and convert them back to the original coordinates of the PDF?

I would also like to identify only lines of a certain width and height, but if I get the rectangles to work correctly, I think I can do that in post-processing.

Thanks in advance! Gilad
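
P.S. To make the question concrete, here is a minimal sketch of the conversion I would expect to need. It is not tabula code: the function names are hypothetical, and it assumes the rulings are in pixels of an image rendered at a known DPI with the origin at the top-left, that the PDF page uses the default 72 units per inch with the origin at the bottom-left, and that the page has no rotation or crop-box offset. If any of those assumptions is wrong, that may be exactly where my offsets come from.

```python
PDF_UNITS_PER_INCH = 72.0  # default PDF user space: 72 units per inch


def image_to_pdf_point(x_px, y_px, dpi, page_height_pt):
    """Map one image pixel coordinate to PDF user-space coordinates.

    Assumes a top-left image origin and a bottom-left PDF origin,
    with no page rotation or crop-box offset.
    """
    scale = PDF_UNITS_PER_INCH / dpi   # pixels -> PDF units
    x_pt = x_px * scale
    y_pt = page_height_pt - y_px * scale  # flip the vertical axis
    return (x_pt, y_pt)


def ruling_to_pdf(ruling_px, dpi, page_height_pt):
    """Convert a ruling given as (x1, y1, x2, y2) in pixels to PDF units."""
    x1, y1, x2, y2 = ruling_px
    start = image_to_pdf_point(x1, y1, dpi, page_height_pt)
    end = image_to_pdf_point(x2, y2, dpi, page_height_pt)
    return start + end  # tuple concatenation: (x1, y1, x2, y2)


# Hypothetical horizontal ruling found on a US Letter page (792 units tall)
# that was rendered at 144 DPI.
print(ruling_to_pdf((100, 200, 500, 200), dpi=144, page_height_pt=792.0))
```

Filtering by length would then just be a comparison on the converted endpoints, which is the post-processing step I mentioned.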

