O. Lavell wrote:
> Hi group,
> I am looking for an easy way to manipulate (read, write) the metadata 
> (title, subject, keywords, author) in PDF files through PHP.
> Most PHP/PDF solutions I have found so far (through Google) are aimed at 
> constructing PDFs from text and graphics, with lots of fancy features, 
> but most of them omit metadata functions altogether.
> I would also prefer something extremely lightweight that I could just 
> include_once() into my script, i.e. not a module or external program. I 
> am currently using pdfinfo from xpdf-utils, but it has to go.
> My use case is I want to build a database with the metadata of a bunch 
> (many hundreds, perhaps thousands) of PDF files in a directory on the 
> server for easy search, statistics and retrieval. I also want users to be 
> able to make edits to any PDF's metadata from the web.
> If it can be at all avoided, I would rather not have to invent the wheel 
> myself here. I have looked at the Adobe PDF specification a bit and it 
> looks quite... challenging. Or should I say daunting.
> Any and all suggestions are welcome. Thank you in advance.

So many people ask about manipulating, editing and generally processing PDF
files. In my experience, PDF is a write-once format - any manipulation should
have been done in whatever source generated the PDF. I think of a PDF as being a
piece of paper: if you want to change the content of a piece of paper it is
usually best to chuck it away and start again...

Even more so, this would apply to the PDF metadata: metadata is supposed to
describe the nature of the document: its author, creation time etc. That sort
of data should be maintained with the document and ideally not changed
throughout the document's lifetime (like the footer, or end-papers in a physical
book).

I do accept that the metadata should be machine-readable: that part of your
project is reasonable and I'm fairly sure that ought to be possible with
something simple. The best bet I have found so far is PDFTK
(http://www.pdfhacks.com/pdftk/), a command-line tool that you could
presumably call from PHP with exec() or similar.
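
For what it's worth, here is a rough, untested-against-your-setup sketch of how
the read side might look. It assumes pdftk is installed and on the web server's
PATH, and that its dump_data operation prints InfoKey:/InfoValue: line pairs,
which is how I understand its output. The function names parse_pdftk_info() and
read_pdf_metadata() are made up for this example.

```php
<?php
// Turn the lines printed by `pdftk file.pdf dump_data` into key => value pairs.
// Assumes each metadata entry is an "InfoKey:" line followed by an "InfoValue:" line.
function parse_pdftk_info(array $lines): array
{
    $info = [];
    $key = null;
    foreach ($lines as $line) {
        if (strpos($line, 'InfoKey:') === 0) {
            // Remember the key until its value line arrives.
            $key = trim(substr($line, 8));
        } elseif ($key !== null && strpos($line, 'InfoValue:') === 0) {
            $info[$key] = trim((string) substr($line, 10));
            $key = null;
        }
        // Other lines (page counts, bookmarks, ...) are ignored.
    }
    return $info;
}

// Run pdftk on one file and return its metadata, or an empty array on failure.
function read_pdf_metadata(string $pdfPath): array
{
    $lines = [];
    $status = 0;
    // escapeshellarg() keeps odd file names from breaking the command line.
    exec('pdftk ' . escapeshellarg($pdfPath) . ' dump_data', $lines, $status);
    return $status === 0 ? parse_pdftk_info($lines) : [];
}
```

You could loop over the directory, call read_pdf_metadata() on each file and
insert the resulting array (Title, Author, Subject, Keywords, where present)
into your database. pdftk also has an update_info operation for writing
metadata back, but I have not tried driving that from PHP.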

