Hi Mac,

You can use PDFTextStripper for this.
It returns all the text from the pages.
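
Once the text of a page is available, the file name can be derived from it. The sketch below covers only that step and uses plain Java, with no PDFBox calls. It assumes each page contains a line such as "Name: John Smith"; that label, the class name PageFileNamer, and the "_<page>.pdf" suffix are all hypothetical and would need to be adapted to the real layout. The page text itself would come from PDFTextStripper, for example by limiting it to one page with setStartPage and setEndPage before calling getText.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns the extracted text of one page into a file name.
final class PageFileNamer {

    // Assumption: the page has a line of the form "Name: <first> <last>".
    private static final Pattern NAME_LINE =
            Pattern.compile("(?m)^[ \\t]*Name:[ \\t]*(\\S.*?)[ \\t]*$");

    // Returns the last whitespace-separated word of the name line, if present.
    static Optional<String> lastName(String pageText) {
        Matcher m = NAME_LINE.matcher(pageText);
        if (!m.find()) {
            return Optional.empty();
        }
        String[] parts = m.group(1).split("\\s+");
        return Optional.of(parts[parts.length - 1]);
    }

    // Builds a file name; characters other than letters, digits, '_' and '-'
    // are replaced with '_'. Falls back to "page" when no name line is found.
    static String fileNameFor(String pageText, int pageNumber) {
        String base = lastName(pageText)
                .map(n -> n.replaceAll("[^\\p{L}\\p{N}_-]", "_"))
                .orElse("page");
        return base + "_" + pageNumber + ".pdf";
    }

    public static void main(String[] args) {
        String sample = "Report\nName: John Smith\nTotal: 3";
        System.out.println(fileNameFor(sample, 2));
    }
}
```

The page number is kept in the name only so that two people with the same last name do not overwrite each other's files; it can be dropped if the last names are known to be unique.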


Best regards
Juraj Lonc


GI-BÓN, spol. s r.o.
Management Systems

Bratislavská 11
SK - 010 01 Žilina
Tel: +421-41-564 3437-8
Mobil: +421-907-815 147
Fax: +421-41-564 3439
e-mail: jl...@gi-bon.sk
homepage: http://www.gi-bon.sk 





From:   Mac P <pon...@hotmail.com>
To:     pdfbox <users@pdfbox.apache.org>, 
Date:   01. 09. 2012 10:02
Subject:        How can I manipulate text in PDF'd by using PDFBox




Hello Forum

Is there any way to split a master PDF file consisting of many pages 
into separate pages based on the content or keywords in each page?

Each page has the person's first and last name. I would like to grep the 
last name and write a script to separate each page and turn it into a new 
PDF file, with the last name being part of the file name instead of 
the sequential page number (out of the total number of pages) at the end 
of each file name.

I know PDFs are binary documents. Are there any tools to look up the last 
names and manipulate them that way?

Thanks

Mac
  
