Thanks for the advice, Daniel. This helps to point me in the right direction. However, I'm still a bit confused, so please bear with me.

Looking at the code for PageDrawer#drawPage, I see that it iterates through the PDAnnotations that belong to the PDPage object. This leads me to two questions:

1) If I already have the PDPage object, isn't it unnecessary to call PageDrawer? That is, can't I just read the PDAnnotations from wherever I'd be calling #drawPage?

2) I'm still not sure how to interpret these objects to obtain the vector graphics information. I'm guessing that it's a matter of iterating through the PDAnnotations and looking for vector graphics commands. Is this correct, or am I missing something?

Sorry for the total newbieness, and thanks for any help you (or anyone else) can provide.

Ray Weidner

On Mon, Aug 22, 2011 at 6:41 PM, Daniel Wilson <[email protected]> wrote:

> The big one to override, IMO, is drawPage.
>
> For a different application, I also override:
>
> - processTextPosition
> - fillPath
> - setStroke
> - getStroke
> - strokePath -- this might be key for your application ...
> - drawImage
>
> Hope this helps.
>
> Daniel
>
> On Mon, Aug 22, 2011 at 5:13 PM, Ray Weidner <[email protected]> wrote:
>
> > Hi,
> >
> > I'm currently using PDFBox to provide me with text/location information
> > in order to heuristically detect table structures in a document. One way
> > I'd like to enhance this is by making use of actual grid lines, when they
> > are present. To do this, I believe I need to extract the vector graphics
> > commands from the document.
> >
> > I found one helpful post on this matter in the mail archives
> > (http://mail-archives.apache.org/mod_mbox/pdfbox-users/200902.mbox/browser).
> > The recommendation was simply to override PageDrawer in order to
> > intercept graphics commands. This sounds like a good idea, but I'm
> > totally unsure of how to interpret the calls that I should be
> > intercepting. Can anyone give me some advice here, or point me to a
> > document that should make things clearer?
> >
> > Please be aware that I am a newbie to both PDFBox and the PDF document
> > standard, so don't assume too much about what I already know.
> > Thanks in advance.
> >
> > Ray Weidner

