(a) Yes.
(b) Yes.

Very basic example code:

    StringWriter out = new StringWriter();
    PDDocument doc = PDDocument.load(file);
    int nbPages = doc.getNumberOfPages();
    PDFTextStripper stripper = new PDFTextStripper();
    // Page numbers are 1-based; start == end extracts a single page.
    stripper.setStartPage(1);
    stripper.setEndPage(1);
    stripper.writeText(doc, out);
    String txt = out.toString().trim();
    out.close();
    doc.close();

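On the words that split between pages: once the text of each page has been extracted as its own string, the pieces still have to be stitched together. The following is only a minimal sketch using plain Java, not anything provided by PDFBox. The class name PageTextJoiner and the method joinPages are made up for illustration, and the rule it applies (a page ending in "-" followed by a page starting with a lowercase letter is one split word) is an assumption that will misfire on genuinely hyphenated compounds.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFBox: joins per-page text into one string.
class PageTextJoiner {

    // Assumption: each list entry holds the text of one page, in page order.
    static String joinPages(List<String> pages) {
        StringBuilder sb = new StringBuilder();
        for (String raw : pages) {
            String page = raw.trim();
            if (page.isEmpty()) {
                continue; // skip pages without any text
            }
            if (sb.length() == 0) {
                sb.append(page);
            } else if (sb.charAt(sb.length() - 1) == '-'
                    && Character.isLowerCase(page.charAt(0))) {
                // Heuristic: treat a trailing hyphen as a word split across pages.
                sb.setLength(sb.length() - 1);
                sb.append(page);
            } else {
                // Otherwise separate the pages with a single space.
                sb.append(' ').append(page);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList(
                "The docu-", "ment spans two pages.", "A new sentence.");
        System.out.println(joinPages(pages));
    }
}
```

In practice the list would be filled by running the stripper once per page, with the start and end page both set to the current page number.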
Please check the sample code included in PDFBox for better examples.

Best regards
Toël Hartmann

On 11 May 2017, at 12:47, David Patterson <patterd20...@gmail.com> wrote:

> Is it possible to
> (a) iterate over the PDF by page [I believe the answer is "Yes"]
> (b) extract the text from a page [Don't know]
>
> This would allow some nice capabilities, but with an added complexity of
> words that split between pages.
>
> Thanks for the info.
>
> Dave Patterson