Newbie question about parsing PDFs

David Goodenough Sun, 25 Sep 2016 03:25:16 -0700

I need to take a PDF document and extract each item of text together with
its position on the page.  PDFBox looks like a good tool for this, but the
examples mostly cover building PDFs rather than parsing them, and the API
is very rich (for which read large).

Does anyone have any code they would be prepared to share that does
this kind of parsing, or some pointers as to which classes I should
be looking at?

Thank you

David
