Willie Alberty wrote:
Although you can open a PDF file in a text editor and more or less
follow its structure, it is not a text file. PDF documents are binary
files. You can irreparably damage a PDF by doing string replacement
operations.
The reason for this is the cross-reference table, which sits near the
end of a PDF file and is located through the document trailer. This
table lists the byte offset of each object contained within the
document. If you do a string replacement that changes the byte length
of the string, every offset recorded for data after that point is now
wrong, and a PDF viewer may be unable to read the document unless it
manages to rebuild the table itself.
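To make the offset problem concrete, here is a minimal sketch in Python. It assembles a toy, PDF-like byte string with a classic offsets table, then checks whether the recorded offsets still point at object headers after a replacement. The helper names (build_pdf, offsets_valid) and the three toy objects are made up for illustration, and the result is not a complete, viewable PDF.

```python
import re

def build_pdf(objects):
    """Assemble a toy PDF-like byte string with a classic offsets table."""
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))              # byte offset of "N 0 obj"
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)                       # byte offset of the table itself
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)

def offsets_valid(data):
    """True if startxref and every in-use entry point where they claim to."""
    start = int(re.search(rb"startxref\s+(\d+)", data).group(1))
    if not data.startswith(b"xref", start):
        return False
    entries = re.findall(rb"(\d{10}) \d{5} n", data[start:])
    return all(re.match(rb"\d+ 0 obj", data[int(e):]) for e in entries)

# Hypothetical toy objects; only the second one holds the text we edit.
pdf = build_pdf([
    b"<< /Type /Catalog >>",
    b"<< /Title (Hello) >>",
    b"<< /Producer (example) >>",
])

same_length = pdf.replace(b"Hello", b"Howdy")         # 5 bytes -> 5 bytes
longer = pdf.replace(b"Hello", b"Hello, world")       # 5 bytes -> 12 bytes

print(offsets_valid(pdf), offsets_valid(same_length), offsets_valid(longer))
```

The same-length replacement leaves every offset intact, while the longer one shifts everything after the edited string, so both the entry for the third object and the startxref value end up pointing at the wrong bytes.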
If you're very careful to maintain the byte length of the strings you're
replacing, you can actually change existing PDF content in this way, but
you're treading on thin ice. If you keep the byte offsets in the
cross-reference table, and the startxref value that points to it,
updated, you can change string lengths too, but this gets to be rather
difficult.
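As a rough illustration of what keeping the offsets updated involves, the Python sketch below rescans a very simple file for object headers and writes a fresh offsets table. It assumes a single classic table, uncompressed objects numbered 1..n with generation 0, and no streams or incremental updates; real files frequently violate all of these, which is a large part of why this is difficult. The function name rebuild_xref and the sample bytes are hypothetical.

```python
import re

def rebuild_xref(data):
    """Recompute the offsets table for a simple, uncompressed PDF-like file."""
    body_end = data.rfind(b"\nxref\n") + 1          # where the old table starts
    body = data[:body_end]
    trailer_start = data.index(b"trailer", body_end)
    trailer_end = data.index(b"startxref", trailer_start)
    trailer = data[trailer_start:trailer_end]       # trailer dictionary, kept as-is

    # Map object number -> byte offset of its "N 0 obj" header.
    offsets = {int(m.group(1)): m.start()
               for m in re.finditer(rb"(?m)^(\d+) 0 obj\b", body)}

    count = max(offsets) + 1
    table = b"xref\n0 %d\n0000000000 65535 f \n" % count
    for num in range(1, count):
        table += b"%010d 00000 n \n" % offsets[num]

    # The new table starts right after the body, so startxref is len(body).
    return body + table + trailer + b"startxref\n%d\n%%%%EOF\n" % len(body)

# A toy file whose title was lengthened by hand; the table entries and the
# startxref value below are stale placeholders, not meaningful offsets.
broken = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    b"2 0 obj\n<< /Title (Hello, world) >>\nendobj\n"
    b"3 0 obj\n<< /Producer (example) >>\nendobj\n"
    b"xref\n0 4\n"
    b"0000000000 65535 f \n"
    b"0000000009 00000 n \n"
    b"0000000045 00000 n \n"
    b"0000000081 00000 n \n"
    b"trailer\n<< /Size 4 /Root 1 0 R >>\n"
    b"startxref\n120\n%%EOF\n"
)

fixed = rebuild_xref(broken)
```

Even in this toy case the trailer's startxref value has to be rewritten along with the table. With streams you would also have to correct each /Length entry, and text inside compressed streams would not be reachable by plain string replacement at all.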
Thanks for that thorough explanation, and thanks for the FOP link,
Thomas. I will take a look at it.
thanks,
kai