I need to take a PDF document and extract each item of text along with its position on the page. PDFBox looks like a good tool for this, but the examples mostly deal with building PDFs rather than parsing them, and the API is very rich (for which read large).
Does anyone have any code they would be prepared to share that does this kind of parsing, or some pointers as to which classes I should be looking at? Thank you, David

