Hi,
> On 29.09.2016 at 15:47, Evan Williams <[email protected]> wrote:
>
> By good fortune I got a form in that shows the problem.
>
> https://dl.dropboxusercontent.com/u/25802656/Tracleer%20Patient%20Enrollment%20and%20Consent%20Form%20Revised.pdf
>
> There is a field that Acrobat quite happily calls 'Tracleer 62.5' and
> treats as an entirely normal text field. But of course PDFBox is confused
> by this.
The field name is "Tracleer 62.5 Quantity Text", and it is in fact two fields:
one called "Tracleer 62" with a child called "5 Quantity Text".
If you look the field up by its fully qualified name,
    PDField field = acroForm.getField("Tracleer 62.5 Quantity Text");
you'll be fine.
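
To make the hierarchy concrete, here is a minimal sketch in plain Java. It
does not use PDFBox at all: Node, find, suspectedDecimalSplits and sampleForm
are hypothetical names made up for illustration, and the "digit, dot, digit"
check is only a heuristic I am assuming would catch names like the one above.
In PDFBox the two name forms would come from PDField.getPartialName() and
PDField.getFullyQualifiedName().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DottedFieldNames {

    /** Hypothetical stand-in for a form field: a partial name plus optional kids. */
    static final class Node {
        final String partialName;
        final Map<String, Node> kids = new LinkedHashMap<>();

        Node(String partialName) {
            this.partialName = partialName;
        }

        Node addKid(String name) {
            Node kid = new Node(name);
            kids.put(name, kid);
            return kid;
        }

        boolean isTerminal() {
            return kids.isEmpty();
        }
    }

    /** Walks the tree one dot-separated segment at a time; null if a segment is missing. */
    static Node find(Node root, String fullyQualifiedName) {
        Node current = root;
        for (String segment : fullyQualifiedName.split("\\.")) {
            current = current.kids.get(segment);
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    /** Lists terminal fields whose parent/child boundary looks like a decimal point. */
    static List<String> suspectedDecimalSplits(Node root) {
        List<String> hits = new ArrayList<>();
        collect(root, "", hits);
        return hits;
    }

    private static void collect(Node node, String prefix, List<String> hits) {
        for (Node kid : node.kids.values()) {
            String name = prefix.isEmpty() ? kid.partialName : prefix + "." + kid.partialName;
            // Heuristic (assumption): a digit on both sides of the joining dot.
            boolean digitBefore = !prefix.isEmpty()
                    && Character.isDigit(prefix.charAt(prefix.length() - 1));
            boolean digitAfter = !kid.partialName.isEmpty()
                    && Character.isDigit(kid.partialName.charAt(0));
            if (kid.isTerminal() && digitBefore && digitAfter) {
                hits.add(name);
            }
            collect(kid, name, hits);
        }
    }

    /** Builds a tiny tree shaped like the field discussed in this thread. */
    static Node sampleForm() {
        Node root = new Node("");
        root.addKid("Tracleer 62").addKid("5 Quantity Text");
        root.addKid("Patient Name"); // made-up second field for contrast
        return root;
    }

    public static void main(String[] args) {
        Node root = sampleForm();
        Node field = find(root, "Tracleer 62.5 Quantity Text");
        System.out.println(field.partialName);            // prints "5 Quantity Text"
        System.out.println(suspectedDecimalSplits(root)); // prints "[Tracleer 62.5 Quantity Text]"
    }
}
```

The point of the sketch is that the dot is not stored in either name; it only
appears when the partial names are joined, which is why a lookup by the full
name works while iterating over the top-level fields shows fragments.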
BR
Maruan
>
> That is the kind of thing that I am talking about. And it is very easy to
> manually fix it in Acrobat of course, but I am trying to build automation
> tools and there are usually very important fields (the ones with the dots)
> that provide a great deal of informational content to my tools so they can
> reason about the form.
>
> Thank you for looking at this.
>
> On Sat, Sep 24, 2016 at 3:07 PM, Evan Williams <[email protected]>
> wrote:
>
>> Hi Maruan,
>>
>> The answer to your question is yes, but my problem is that I tend to fix
>> the PDFs every time I find this issue so I am not certain that I have any
>> sitting around that show the problem. But it is easy enough to create. I
>> will just edit a PDF with Acrobat and put a dot in a field name. I will do
>> that later this afternoon.
>>
>> Thank you.
>>
>> On Sat, Sep 24, 2016 at 2:21 PM, Maruan Sahyoun <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>>> On 24.09.2016 at 17:13, Evan Williams <[email protected]> wrote:
>>>>
>>>> I have a problem, but I think it's non-terminal.
>>>>
>>>> I have been using PDFBox to work with forms for about a year and a half,
>>>> and I have a handle on many things, but I have a persistent and pernicious
>>>> issue with forms where fields have periods ('.') in their name.
>>>
>>> Would it be possible to upload a sample to a public location so I can take
>>> a look?
>>>
>>> BR
>>>
>>> Maruan
>>>
>>>>
>>>> These forms are from external sources and are typically old-school
>>>> AcroForms. Because of the nature of the forms (medical), they often contain
>>>> decimal values like '0.5 mg' or 'W55.21'. These forms do not seem to have
>>>> ever been meant to be read programmatically. They are for human consumption.
>>>>
>>>> As far as I can tell, '.' is a magic character used by fully qualified
>>>> names that delineates elements of the path. So when I iterate over the
>>>> fields I get a bunch of name fragments as 'PDNonTerminalField's and regular
>>>> fields.
>>>>
>>>> My current way of dealing with this is to waste the time of a skilled
>>>> graphic designer, or my own time, manually going in and fixing it. This is
>>>> mostly just an annoyance. But annoyances add up. And I am trying to
>>>> automate as much as I possibly can in dealing with these forms.
>>>>
>>>> *Is there any obvious way to identify this corrupt situation and correct it?*
>>>>
>>>> I wonder if I am just doing something wrong (I am iterating over the
>>>> fields in the time-honored way that the form example that is included with
>>>> PDFBox uses).
>>>>
>>>> Adobe Acrobat seems perfectly happy to deal with fields containing periods
>>>> (including, unfortunately, allowing people to create them). So there must
>>>> be some way to deal with this.
>>>>
>>>> Your advice would be of great service to me.
>>>>
>>>> Thank you.
>>>> --
>>>> *Evan Williams*
>>>> Sr. Software Engineer
>>>> [email protected]
>>>>
>>>> *www.ZappRx.com <http://www.zapprx.com/>*
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>>
>>>