Re: How to manipulate a pdf object

Tilman Hausherr Tue, 29 Mar 2016 12:09:07 -0700

On 29.03.2016 at 20:46, Kevin Ternes wrote:

Maruan and Tilman,
I think you have answered my question--that I am basically out of luck.
I already ran one through the usual PDF-Tools Debugger but it did not tell me 
anything that I thought was useful.  I also tried looking at the PDF under 
Acrobat's preflight.


But here is the use case:
I have a large number of PDF "templates" that in our usual business process, we 
use PDFBox to load, set form field values, add images, merge, flatten, protect, . . .

However, it turns out that the specification for many of these templates has 
changed so that a piece of text needs to be moved slightly up, a cm to the left 
and have the font size changed.  Then there are some places where someone drew 
lines around hundreds of form checkboxes!!!  So while I'm at it I'd like to 
delete those lines and set the form field widgets to have a border.

I wanted to write a quick command line program to do this.

Likely won't be possible. What I do is run the WriteDecodedDoc command line utility and then make the changes manually. However, you need to understand the PDF operators, and the sizes of the content streams should not change, i.e. all object positions must stay the same.
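
To illustrate the "sizes should not change" constraint, here is a minimal sketch in plain Java (no PDFBox calls). It assumes the file has already been decoded so the content stream is readable, that the text being replaced ends at a token boundary (so trailing padding spaces are harmless), and that the replacement is not longer than the original. The class and method names are made up for this example.

```java
import java.nio.charset.StandardCharsets;

public class SameLengthEdit {

    /**
     * Replaces the first occurrence of oldText with newText and pads the
     * remainder with spaces, so the byte length (and therefore every
     * object offset after the edit) stays exactly the same.
     */
    public static byte[] replaceKeepingLength(byte[] data, String oldText, String newText) {
        byte[] oldBytes = oldText.getBytes(StandardCharsets.ISO_8859_1);
        byte[] newBytes = newText.getBytes(StandardCharsets.ISO_8859_1);
        if (newBytes.length > oldBytes.length) {
            throw new IllegalArgumentException("replacement is longer than the original");
        }
        int at = indexOf(data, oldBytes);
        if (at < 0) {
            throw new IllegalArgumentException("text to replace was not found");
        }
        byte[] out = data.clone();
        System.arraycopy(newBytes, 0, out, at, newBytes.length);
        // Fill the leftover bytes with spaces; whitespace between tokens is ignored.
        for (int i = at + newBytes.length; i < at + oldBytes.length; i++) {
            out[i] = ' ';
        }
        return out;
    }

    /** Naive byte search; returns -1 if needle does not occur in haystack. */
    static int indexOf(byte[] haystack, byte[] needle) {
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            int j = 0;
            while (j < needle.length && haystack[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }
}
```

For example, replacing "/F1 12 Tf" with "/F1 9 Tf" changes the font size and leaves one padding space behind, so nothing after it moves. A replacement that would need more bytes than the original is rejected rather than shifting the rest of the file.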


Alternatively, get Acrobat Professional.

Tilman

I estimate that to do this one-pdf-at-a-time would take 10-20 hours.  That 
would not be a problem except that we don't have an intern.

Any suggestions appreciated.

-----Original Message-----
From: Maruan Sahyoun [mailto:[email protected]]
Sent: Tuesday, March 29, 2016 1:06 PM
To: [email protected]
Subject: Re: How to manipulate a pdf object

Hi,

On 29.03.2016 at 19:54, Kevin Ternes <[email protected]> wrote:

I have successfully updated form widgets on pre-existing PDFs.
But what about ordinary non-form objects like a box of text?  I can add NEW 
objects to the PDPageContentStream.
But how do I even get a reference to an existing object?

What is it that you are trying to achieve? You can parse an existing content 
stream and look for individual tokens. But there is no guarantee that what 
you are calling a box of text is treated like that in the PDF, as there is no 
such concept. E.g. individual lines, words, or the characters forming a word 
could be placed individually in different operations. It might not even be text 
but a vector or bitmap image. Your best bet is to look into the content using the 
PDFDebugger and see if you can identify the parts you are looking for.
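
To make "parse the content stream and look for tokens" a bit more concrete, here is a rough sketch in plain Java. It does not use PDFBox; in a real program PDFBox's PDFStreamParser would do the tokenizing. Everything here is an assumption for illustration: the class and method names are invented, the stream is assumed to be already decoded, and the tokenizer is deliberately naive (it splits on whitespace, so it breaks on strings containing spaces, arrays such as TJ operands, and inline images). It also assumes the text is positioned with a plain Td, which, as said above, is by no means guaranteed.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

public class ContentStreamSketch {

    /** One operator together with the operands that preceded it. */
    public static final class Op {
        public final String operator;
        public final List<String> operands;

        Op(String operator, List<String> operands) {
            this.operator = operator;
            this.operands = operands;
        }
    }

    /** Groups whitespace-separated tokens into operand-list + operator pairs. */
    public static List<Op> parse(String content) {
        List<Op> ops = new ArrayList<>();
        List<String> pending = new ArrayList<>();
        for (String tok : content.trim().split("\\s+")) {
            if (tok.isEmpty()) {
                continue;
            }
            if (isOperand(tok)) {
                pending.add(tok);
            } else {
                ops.add(new Op(tok, pending));
                pending = new ArrayList<>();
            }
        }
        return ops;
    }

    /** Rough guess: names, strings, arrays and numbers are operands. */
    private static boolean isOperand(String tok) {
        char c = tok.charAt(0);
        return c == '/' || c == '(' || c == '<' || c == '[' || c == '-'
                || c == '+' || c == '.' || Character.isDigit(c);
    }

    /** Shifts every Td by (dx, dy) and sets the size operand of every Tf. */
    public static String rewrite(String content, String dx, String dy, String newSize) {
        StringBuilder out = new StringBuilder();
        for (Op op : parse(content)) {
            List<String> args = new ArrayList<>(op.operands);
            if (op.operator.equals("Td") && args.size() == 2) {
                args.set(0, shift(args.get(0), dx));
                args.set(1, shift(args.get(1), dy));
            } else if (op.operator.equals("Tf") && args.size() == 2) {
                args.set(1, newSize); // operands of Tf are: font name, size
            }
            for (String a : args) {
                out.append(a).append(' ');
            }
            out.append(op.operator).append('\n');
        }
        return out.toString();
    }

    /** Decimal arithmetic avoids floating-point noise in the output. */
    private static String shift(String number, String delta) {
        return new BigDecimal(number).add(new BigDecimal(delta))
                .stripTrailingZeros().toPlainString();
    }
}
```

Calling rewrite("BT /F1 12 Tf 72 700 Td (Hello) Tj ET", "-28.35", "5", "10") moves the text about 1 cm (28.35 points) to the left, 5 points up, and sets the size to 10. Note that this touches every Td and Tf in the stream; picking out only the piece of text you care about is the hard part, and that is where inspecting the file in PDFDebugger comes in.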

Maybe you can elaborate a little more on your use case.

BR
Maruan

Viewing the document in Acrobat does not give me a clue as to what the object 
might even be called.

PDFBox-2.0.0


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



