Flatten the form fields before searching the file if you want
PDFTextStripper to find the text in them.
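
A minimal sketch of that, assuming PDFBox 3.0.x (the version in your
commands). The class and method names are only for illustration, and
"Test.pdf" is the file name from your example:

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.text.PDFTextStripper;

// Illustrative class name, not part of PDFBox.
public class FlattenThenExtract {

    // Flattens the AcroForm (if the document has one) so that the field
    // appearances become part of the page content, then extracts the text.
    static String extractText(PDDocument doc) throws IOException {
        PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
        if (acroForm != null) {
            acroForm.flatten();
        }
        PDFTextStripper stripper = new PDFTextStripper();
        return stripper.getText(doc);
    }

    public static void main(String[] args) throws IOException {
        // Falls back to "Test.pdf" when no path is given on the command line.
        File file = new File(args.length > 0 ? args[0] : "Test.pdf");
        try (PDDocument doc = Loader.loadPDF(file)) {
            System.out.print(extractText(doc));
        }
    }
}
```

Flattening only changes the document in memory, so the original file keeps
its fields as long as you do not save over it. For the positions, a
PDFTextStripper subclass that overrides writeString(String,
List<TextPosition>) should work the same way on the flattened document.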

On Thu, Mar 21, 2024 at 12:10 PM Paul Grütter
<paul.gruet...@signotec.de.invalid> wrote:

> Hello list,
>
> I want to search for words in a PDF document and get their positions. It
> seems that PDFBox ignores text which has been entered into a form field
> although it’s rendered correctly. This can be reproduced easily with the
> standalone app:
>
> java -jar pdfbox-app-3.0.2.jar export:text -i=Test.pdf
>
> java -jar pdfbox-app-3.0.2.jar render -i=Test.pdf
>
> Acrobat both finds and extracts text which has been entered into a
> form field.
>
> In my code I use PDFTextStripper. I haven’t found any way to configure the
> behaviour. Is it a bug or have I overlooked something? For clarification: I
> don’t want to search for the value (‘V’) but its visual representation
> (‘AP’).
>
> Kind regards,
>
> Dipl.-Ing. (FH) Paul Grütter
> Head of Development
>
> *signotec GmbH*
> Am Gierath 20b
> 40885 Ratingen (Germany)
>
> Tel.: +49 2102 53575-10
> Fax: +49 2102 53575-39
>
> E-Mail: paul.gruet...@signotec.de
> URL: www.signotec.com
>
> Amtsgericht Düsseldorf: HRB 44307
> Geschäftsführung/CEO: Arne Brandes
