Flatten the form fields before searching the file if you want PDFTextStripper to find the text in them.
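
A minimal sketch of that suggestion, assuming PDFBox 3.0.x on the classpath. The class name FlattenThenExtract and the helper extractText are made up for illustration. PDAcroForm.flatten() merges each field's existing appearance stream ('AP') into the page content, which is what PDFTextStripper reads:

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical class name, for illustration only.
public class FlattenThenExtract {

    // Flattens the AcroForm (if the document has one) and then extracts text.
    // The flattening happens only in memory; nothing is saved back to disk.
    static String extractText(PDDocument doc) throws IOException {
        PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
        if (acroForm != null) {
            // Moves the field appearances into the page content streams.
            acroForm.flatten();
        }
        return new PDFTextStripper().getText(doc);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the PDF, e.g. Test.pdf
        try (PDDocument doc = Loader.loadPDF(new File(args[0]))) {
            System.out.println(extractText(doc));
        }
    }
}
```

If you need positions rather than plain text, the same flattened document can be passed to a PDFTextStripper subclass that overrides writeString(String, List<TextPosition>).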

On Thu, Mar 21, 2024 at 12:10 PM Paul Grütter <paul.gruet...@signotec.de.invalid> wrote:

> Hello list,
>
> I want to search for words in a PDF document and get their positions. It
> seems that PDFBox ignores text that has been entered into a form field,
> although it is rendered correctly. It can be reproduced easily with the
> standalone app:
>
> java -jar pdfbox-app-3.0.2.jar export:text -i=Test.pdf
> java -jar pdfbox-app-3.0.2.jar render -i=Test.pdf
>
> Acrobat both finds and extracts text that has been entered into a
> form field.
>
> In my code I use PDFTextStripper. I haven’t found any way to configure the
> behaviour. Is it a bug or have I overlooked something? For clarification: I
> don’t want to search for the value (‘V’) but its visual representation
> (‘AP’).
>
> Kind regards,
>
> Dipl.-Ing. (FH) Paul Grütter
> Head of Development
> signotec GmbH