In any PDF file there are usually a number (sometimes hundreds) of lines beginning "/Title", one of which is the title of the PDF in question. If it has one, that is.
The attached script, which is really very small and which I hope will provide a moment or two's innocent amusement, aims to extract the right /Title line. It seems to work with encoded and unencoded PDF files with Mac, Unix and Windows line breaks (even one file with a mixture of all three), and it runs quite fast.

I would be very grateful to hear from anyone who succeeds in breaking it or, alternatively, finds any use for it.

I hope this isn't too far OT...

Alan Fry

-----------
#!perl -w
use strict;

my $start = (times)[0];
my $f = $ARGV[0];
defined $f or die "Usage: $0 file.pdf\n";
print "$f\n";

# Read the whole file as raw bytes.
open(IN, $f) or die "Cannot open $f: $!\n";
binmode IN;
read IN, my($str), -s $f;
close IN;

# The trailer names the Info object, e.g. "/Info 12 0 R".
my $info_block = $str =~ /\/Info\s(\d+)\s0\sR/ ? $1 : undef;
my $info_start = defined $info_block ? index($str, "$info_block 0 obj") : -1;

# Cut out the Info object, up to and including the first ">>" after its start.
my $info_obj = $info_start < 0 ? '' :
    substr $str, $info_start, index($str, ">>", $info_start)-$info_start+2;

# The title must sit on one line: the class excludes CR and LF only.
my $title = $info_obj =~ /\/Title\s*\(([^\015\012]*)\)/ ? "= $1" : 'undefined';
print "/Title $title\n";

my $finish = (times)[0];
print 'Time taken ', $finish-$start, "\n";
------------
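
For anyone trying to break the script, a few cases look worth testing: an Info dictionary written on a single line (where the greedy title match can run on to a later ")"), a title containing escaped parentheses, and an Info object number such as 2 that also appears at the end of a larger number such as 12. The following is a minimal sketch of one way to handle those cases, not the method used above. The name pdf_title is a hypothetical helper, and the sketch assumes a literal-string title (not a hex or UTF-16 string) and an Info dictionary with no nested << >>.

```perl
#!perl -w
use strict;

# pdf_title is a hypothetical helper name, not part of the original script.
# It takes the raw bytes of a PDF and returns the Info dictionary's /Title,
# or undef when none is found.
sub pdf_title {
    my ($str) = @_;

    # First indirect reference to the Info dictionary, e.g. "/Info 12 0 R".
    $str =~ /\/Info\s+(\d+)\s+(\d+)\s+R/ or return undef;
    my ($num, $gen) = ($1, $2);

    # (?<!\d) stops object "2 0 obj" from matching inside "12 0 obj".
    $str =~ /(?<!\d)$num\s+$gen\s+obj\b/ or return undef;
    my $start = $+[0];                      # offset just past "obj"

    # Assumption: the Info dictionary contains no nested << >> dictionary.
    my $end = index($str, ">>", $start);
    return undef if $end < 0;
    my $info = substr $str, $start, $end - $start;

    # Literal string: a run of escaped characters or non-parenthesis
    # characters, so the match stops at the first unescaped ")".
    $info =~ /\/Title\s*\(((?:\\.|[^\\()])*)\)/s or return undef;
    my $title = $1;
    $title =~ s/\\([()\\])/$1/g;            # undo \( \) and \\ only
    return $title;
}

if (@ARGV) {
    my $f = $ARGV[0];
    open(my $in, '<', $f) or die "Cannot open $f: $!\n";
    binmode $in;
    local $/;                               # slurp the whole file
    my $str = <$in>;
    close $in;
    my $title = pdf_title($str);
    print "/Title ", defined $title ? "= $title" : 'undefined', "\n";
}
```

Octal escapes, unescaped balanced parentheses inside the title, and files with several trailers are deliberately left out to keep the sketch short.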