Thanks for the advice, Daniel.  This helps to point me in the right
direction.  However, I'm still a bit confused, so please bear with me.

Looking at the code for PageDrawer#drawPage, I see that it iterates
through the PDAnnotations that belong to the PDPage object.  This leads me
to two questions:

1) If I already have the PDPage object, then isn't it unnecessary to call
PageDrawer at all?  That is, can't I just read the PDAnnotations from
wherever I'd be calling #drawPage?

2) I'm still not sure how to interpret these objects to obtain the vector
graphics information.  I'm guessing that it's a matter of iterating through
the PDAnnotations and looking for vector graphics commands.  But is this
correct or am I missing something?
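
To make question 2 concrete, here is a rough sketch of the kind of
interpretation I have in mind.  It assumes that whatever I override
(strokePath, say) can hand me the current path as a
java.awt.geom.GeneralPath; that is an assumption on my part, and I have not
verified how PageDrawer exposes the path.  The sketch itself uses only
java.awt.geom, and GridLineSketch and axisAlignedSegments are names I made
up for illustration, not anything in PDFBox.

```java
import java.awt.geom.GeneralPath;
import java.awt.geom.Line2D;
import java.awt.geom.PathIterator;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox: collects the straight segments of
// a path that are horizontal or vertical, as candidates for table grid lines.
class GridLineSketch {

    static List<Line2D> axisAlignedSegments(GeneralPath path, double tolerance) {
        List<Line2D> result = new ArrayList<Line2D>();
        double[] c = new double[6];      // coordinates of the current segment
        double startX = 0, startY = 0;   // start of the current subpath
        double curX = 0, curY = 0;       // current point
        for (PathIterator it = path.getPathIterator(null); !it.isDone(); it.next()) {
            switch (it.currentSegment(c)) {
                case PathIterator.SEG_MOVETO:
                    startX = curX = c[0];
                    startY = curY = c[1];
                    break;
                case PathIterator.SEG_LINETO:
                    addIfAxisAligned(result, curX, curY, c[0], c[1], tolerance);
                    curX = c[0];
                    curY = c[1];
                    break;
                case PathIterator.SEG_CLOSE:
                    // Closing a subpath draws a line back to its start point.
                    addIfAxisAligned(result, curX, curY, startX, startY, tolerance);
                    curX = startX;
                    curY = startY;
                    break;
                case PathIterator.SEG_QUADTO:
                    // Curves are ignored; only the current point is updated.
                    curX = c[2];
                    curY = c[3];
                    break;
                case PathIterator.SEG_CUBICTO:
                    curX = c[4];
                    curY = c[5];
                    break;
                default:
                    break;
            }
        }
        return result;
    }

    private static void addIfAxisAligned(List<Line2D> out, double x1, double y1,
                                         double x2, double y2, double tolerance) {
        boolean horizontal = Math.abs(y1 - y2) <= tolerance;
        boolean vertical = Math.abs(x1 - x2) <= tolerance;
        // Exactly one must hold; if both hold the segment is (nearly) a point.
        if (horizontal != vertical) {
            out.add(new Line2D.Double(x1, y1, x2, y2));
        }
    }

    public static void main(String[] args) {
        // A closed rectangle, such as a table cell border might produce.
        GeneralPath rect = new GeneralPath();
        rect.moveTo(10f, 10f);
        rect.lineTo(110f, 10f);
        rect.lineTo(110f, 60f);
        rect.lineTo(10f, 60f);
        rect.closePath();
        for (Line2D s : axisAlignedSegments(rect, 0.5)) {
            System.out.println(s.getP1() + " -> " + s.getP2());
        }
    }
}
```

If this is roughly the right idea, I imagine calling something like it from
an overridden strokePath before the path is cleared, but I don't know
whether that is the right hook, or what coordinate space the path is in at
that point.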

Sorry for the total newbieness, and thanks for any help you (or anyone else)
can provide.

Ray Weidner



On Mon, Aug 22, 2011 at 6:41 PM, Daniel Wilson <
[email protected]> wrote:

> The big one to override, IMO, is drawPage.
>
> For a different application, I also override:
>
>   - processTextPosition
>   - fillPath
>   - setStroke
>   - getStroke
>   - strokePath -- this might be key for your application ...
>   - drawImage
>
> hope this helps.
>
> Daniel
>
> On Mon, Aug 22, 2011 at 5:13 PM, Ray Weidner <
> [email protected]> wrote:
>
> > Hi,
> >
> > I'm currently using PDFBox to provide me with text/location information
> > in order to heuristically detect table structures in a document.  One way
> > I'd like to enhance this is by making use of actual grid lines, when they
> > are present.  To do this, I believe I need to extract the vector graphics
> > commands from the document.
> >
> > I found one helpful post on this matter in the mail archives
> > (http://mail-archives.apache.org/mod_mbox/pdfbox-users/200902.mbox/browser).
> > The recommendation was simply to override PageDrawer in order to intercept
> > graphics commands.  This sounds like a good idea, but I'm totally unsure of
> > how to interpret the calls that I should be intercepting.  Can anyone give
> > me some advice here, or point me to a document that should make things
> > clearer?
> >
> > Please be aware that I am a newbie to both PDFBox and the PDF
> > document standard, so don't assume too much about what I already know.
> > Thanks in advance.
> >
> > Ray Weidner
> >
>
