On Thu, 10 Feb 2005 12:28:19 -0700, Jason Motes <[EMAIL PROTECTED]> wrote:

Hello,

Is there any way to retrieve the properties from a PDF file using PHP?

When you right click on a pdf file in windows you can see the title of the file and you can change this property there also.

I wrote a PHP page that lists all files in a certain directory. I want to be able to show the actual title of the document instead of just the file name.

I have searched the manual and Google; everything that comes up refers to generating PDFs on the fly, not working with an already made PDF.


Thanks in Advance,

Jason Motes


If you open a PDF in a text editor, you'll notice that much of its structure is quite readable. If you go to the end of a document, preferably a small one, you will see the cross-reference table followed by the trailer. The table lists the byte position of every object the PDF document contains. These objects can be images, text, page descriptions, page groups and so on. The trailer can also hold a reference to the properties of the document (the /Info object), which can contain creator, author, title etc. It is easy to read the objects, but changing them can prove rather difficult: unless the new value has exactly the same length as the old one, every object that comes after the properties shifts, and you need to update the table with the new byte positions.
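
To make the layout described above a bit more concrete, here is a rough sketch that looks at the end of a PDF for the "startxref" keyword (the byte position of the object table) and for the /Info reference in the trailer. The function name pdf_trailer_summary is made up for this example, and the sketch assumes a classic, uncompressed PDF with a plain "trailer" section; newer files that store this data in compressed streams will not match.

```php
<?php
// Hypothetical helper (not part of PHP): summarises the end of a PDF.
// Assumes a classic PDF with a textual "trailer" and "startxref" section.
function pdf_trailer_summary(string $data): ?array
{
    // The last "startxref" is followed by the byte offset of the xref table.
    $pos = strrpos($data, 'startxref');
    if ($pos === false) {
        return null;
    }
    if (!preg_match('/\Gstartxref\s+(\d+)/', $data, $m, 0, $pos)) {
        return null;
    }

    // The last trailer dictionary may point at the properties object,
    // written as "/Info <object number> <generation> R".
    $info = null;
    $trailerPos = strrpos($data, 'trailer');
    if ($trailerPos !== false
        && preg_match('/\/Info\s+(\d+)\s+(\d+)\s+R/', $data, $t, 0, $trailerPos)) {
        $info = (int) $t[1];
    }

    return ['xref_offset' => (int) $m[1], 'info_object' => $info];
}
```

Calling it on the result of file_get_contents() for a small PDF should give you the offset where the table starts and the number of the object holding the title, author and so on (or null for info_object if the document has no properties).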

But as you only need to extract the title you can do this with a simple regexp.

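$dir = '.'; // directory to list, adjust as needed
foreach (glob("$dir/*.pdf") as $pdf) {
    $document = file_get_contents($pdf);
    // \s* because many PDFs write "/Title(...)" with no space before the parenthesis.
    // No /i flag: PDF names such as /Title are case-sensitive.
    if ($document !== false && preg_match('/\/Title\s*\((.*?)\)/', $document, $match)) {
        $title = $match[1];
    } else {
        // No title found: fall back to the file name.
        $title = basename($pdf);
    }
    echo $title, "\n";
}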
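
One thing to watch out for with the simple pattern is that it stops at the first ")", so a title that itself contains an escaped parenthesis gets cut short. A slightly more careful variant could look something like the sketch below. The function name pdf_title is made up for this example; it only handles backslash-escaped parentheses and backslashes, and it will still miss titles stored as UTF-16 or hex strings, titles with unescaped nested parentheses, and anything inside compressed streams.

```php
<?php
// Hypothetical helper (not part of PHP): extracts /Title from raw PDF data.
// Returns null when no literal-string title is found.
function pdf_title(string $data): ?string
{
    // Inside the parentheses accept either a backslash followed by any
    // character (an escape such as \) or \\) or any character that is
    // not a backslash or a parenthesis.
    $pattern = '/\/Title\s*\(((?:\\\\.|[^\\\\()])*)\)/s';
    if (!preg_match($pattern, $data, $m)) {
        return null;
    }

    // Undo the escapes for parentheses and backslashes only; other
    // escapes (for example \n or octal codes) are left untouched.
    return preg_replace('/\\\\([()\\\\])/', '$1', $m[1]);
}
```

In the directory listing you would call pdf_title() on the file contents and fall back to the file name whenever it returns null.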

Hope this helps.

--
Stian

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


