Hi,

I also had a look. A somewhat simpler approach, which should work at
least for the fields I've looked at, is to add a T entry to those
widgets whose COSDictionary has no T entry. After reloading, they
would then be treated as fields.
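
A minimal sketch of this idea, assuming PDFBox 3.x (Loader.loadPDF; on
2.x, PDDocument.load would take its place). The class name WidgetNamer
and the "w0", "w1", ... naming scheme are illustrative assumptions, not
something from this thread:

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationWidget;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
import org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField;

public class WidgetNamer {

    // Gives every T-less widget of a multi-widget field its own T entry.
    // Returns the number of widgets that were named.
    static int nameWidgets(PDAcroForm form) {
        int named = 0;
        for (PDField field : form.getFieldTree()) {
            if (!(field instanceof PDTerminalField)) {
                continue;
            }
            List<PDAnnotationWidget> widgets = ((PDTerminalField) field).getWidgets();
            if (widgets.size() < 2) {
                continue; // a single widget needs no disambiguation
            }
            for (int i = 0; i < widgets.size(); i++) {
                COSDictionary dict = widgets.get(i).getCOSObject();
                if (!dict.containsKey(COSName.T)) {
                    dict.setString(COSName.T, "w" + i); // assumed naming scheme
                    named++;
                }
            }
        }
        return named;
    }

    // Usage: java WidgetNamer input.pdf output.pdf
    public static void main(String[] args) throws IOException {
        try (PDDocument doc = Loader.loadPDF(new File(args[0]))) {
            PDAcroForm form = doc.getDocumentCatalog().getAcroForm();
            if (form != null) {
                System.out.println(nameWidgets(form) + " widgets named");
            }
            doc.save(args[1]);
        }
    }
}
```

With this scheme the widgets of OBJ4 would show up after reloading as
fields named OBJ4.w0, OBJ4.w1 and so on. The sketch does not
distinguish field types, so it would probably need a filter for radio
button groups, where several widgets per field are intentional.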

The approach Tilman suggested is more complete and gives you more
control.
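
The steps Tilman describes (quoted below) could be sketched roughly as
follows. This is only an outline under several assumptions: it handles
top-level fields only, works directly on the COS dictionaries instead
of constructing a PDField object, and the class name FieldSplitter and
the "_1", "_2" name suffixes are made up for illustration (a real tool
would also check that the new names are not already taken):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.pdfbox.cos.COSArray;
import org.apache.pdfbox.cos.COSBase;
import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationWidget;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
import org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField;

public class FieldSplitter {

    // Moves every widget after the first into a new top-level field of its own.
    // Returns the number of fields created.
    static int splitFields(PDAcroForm form) {
        COSArray formFields = form.getCOSObject().getCOSArray(COSName.FIELDS);
        if (formFields == null) {
            return 0;
        }
        int created = 0;
        // Iterate over a copy so that adding fields does not disturb the loop.
        for (PDField field : new ArrayList<>(form.getFields())) {
            if (!(field instanceof PDTerminalField) || field.getPartialName() == null) {
                continue;
            }
            COSDictionary original = field.getCOSObject();
            COSArray kids = original.getCOSArray(COSName.KIDS);
            List<PDAnnotationWidget> widgets = ((PDTerminalField) field).getWidgets();
            if (kids == null || widgets.size() < 2) {
                continue;
            }
            for (int i = 1; i < widgets.size(); i++) {
                COSDictionary widget = widgets.get(i).getCOSObject();
                // Copy all key/values except Kids, T and AP.
                COSDictionary copy = new COSDictionary();
                for (Map.Entry<COSName, COSBase> entry : original.entrySet()) {
                    COSName key = entry.getKey();
                    if (!COSName.KIDS.equals(key) && !COSName.T.equals(key)
                            && !COSName.AP.equals(key)) {
                        copy.setItem(key, entry.getValue());
                    }
                }
                // Assumed naming scheme: original name plus "_<index>".
                copy.setString(COSName.T, field.getPartialName() + "_" + i);
                COSArray newKids = new COSArray();
                newKids.add(widget);
                copy.setItem(COSName.KIDS, newKids);
                widget.setItem(COSName.PARENT, copy);
                kids.removeObject(widget); // delete it from the original field
                formFields.add(copy);
                created++;
            }
        }
        return created;
    }
}
```

Since V is among the copied entries, each new field starts out with
the value of the original field until it is set separately.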

BR
Maruan

On Tuesday, 12.08.2025 at 15:14 +0200, Tilman Hausherr wrote:
> On 12.08.2025 at 14:50, Ulf Dittmer wrote:
> > For OBJ2 that makes sense, as it is the same info on both pages. But
> > when filling in any of OBJ4, OBJ9 or OBJ10 (to name just a few), that
> > data appears on both page 1 and 3, in fields that have nothing to do
> > with one another.
> 
> OBJ4 is also on several pages (which is allowed). I opened the PDF in
> Adobe and entered my first name, and it then appeared on page 3 at the
> "wrong" place, but that's the problem of whoever created (or altered)
> the PDF.
> 
> I think what you really want is to consider "Is there a way within the
> PDFBox API to dis-ambiguate those fields" as an isolated question, i.e.
> create a new field for each of the extra widgets.
> 
> Yes, it would be possible. You'd have to create a new COSDictionary,
> copy all the key/values (except Kids, T and AP), then create a new
> PDField from that dictionary, add one of the widgets (and delete it from
> the original field), and calculate a new "T" value (field name).
> 
> I don't know if there is a commercial tool for this. It can probably be
> done with PDFBox in less than a day. I might help for free but I'd
> prefer you try first.
> 
> Tilman
> 
> 
> > 
> > org.apache.pdfbox.examples.interactive.form.PrintFields only lists those
> > fields once, but they do appear to be used on multiple pages.
> > 
> > Ulf
> > 
> > On Tue, Aug 12, 2025 at 2:41 PM Tilman Hausherr <thaush...@t-online.de>
> > wrote:
> > 
> > > Hi,
> > > 
> > > I don't see how these field names are duplicated. Some of the fields
> > > have several widgets, e.g. OBJ2 is on page 1 and page 3. This is done
> > > to have the same content on several pages.
> > > 
> > > Tilman
> > > 
> > > On 12.08.2025 at 14:14, Ulf Dittmer wrote:
> > > > Hello-
> > > > 
> > > > I'm encountering PDFs with forms that have non-unique field names.
> > > > Sometimes fields with the same names are used for the same
> > > > information (a useful scenario, making filling them out
> > > > programmatically easier). But sometimes the same names are used for
> > > > entirely different field purposes.
> > > > 
> > > > Is there a way within the PDFBox API to dis-ambiguate those fields?
> > > > Or are there tools that can do this (we do have a budget, so payware
> > > > would be OK, within limits)?
> > > > 
> > > > These are government PDFs, so we don't control their creation. But
> > > > if there is a way to edit them that addresses this, that would also
> > > > work for us - the forms do not change frequently.
> > > > 
> > > > http://ulfdittmer.com/Guide_TH.pdf is an example of such a PDF.
> > > > OBJ3, OBJ4, OBJ10, OBJ18 and OBJ24 are field names that are used
> > > > twice.
> > > > 
> > > > Any help would be appreciated.
> > > > 
> > > > Ulf
> > > > 
> > > 
> > > -----------------------------------------------------------------
> > > ----
> > > To unsubscribe, e-mail:users-unsubscr...@pdfbox.apache.org
> > > For additional commands, e-mail:users-h...@pdfbox.apache.org
> > > 
> > > 
