At 14:37 +0100 on 06.05.2002, Alan Fry wrote:
>open(IN, $f);

Problem 1:
open() fails if a filename contains spaces. This is a very
common problem; even Net::FTP didn't work.
Everybody who opens files from the Desktop runs into it
sooner or later.
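
A minimal sketch of one way around this, assuming Perl 5.6 or later:
the three-argument form of open() keeps the mode separate from the
name, so the name is passed through unchanged (the two-argument form
strips leading and trailing whitespace from it). slurp_file is a
hypothetical helper name, not something from Alan's script.

```perl
use strict;
use warnings;

# slurp_file is a hypothetical helper: read a whole file into a scalar.
sub slurp_file {
    my ($path) = @_;

    # Three-argument open: the mode is given separately, so $path is
    # used exactly as passed in, spaces included.
    open( my $fh, '<', $path ) or die "cannot open '$path': $!\n";
    binmode $fh;    # PDF data is binary
    local $/;       # slurp mode: read everything in one go
    my $data = <$fh>;
    close $fh;
    return $data;
}
```

sysopen(), as used in the sub further down, also takes the name
literally; the three-argument open() is just the shorter spelling.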


>my $info_start = index($str, "$info_block 0 obj");

Problem 2:
index() also matches inside longer object numbers: a search
for "4 0 obj" will find "14 0 obj" as well, which is the
wrong object.
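
To illustrate the difference, here is a small sketch that compares a
plain index() call with a match that only accepts the object header at
the start of a line. find_obj_start and the sample data are made up
for the example, and it assumes the header always begins a line (and
Perl 5.6 or later for @-).

```perl
use strict;
use warnings;

# find_obj_start is a hypothetical helper: offset of "$num 0 obj" in
# $str, accepted only at the very start or right after a CR or LF.
sub find_obj_start {
    my ( $str, $num ) = @_;
    if ( $str =~ /(?:\A|[\015\012])($num 0 obj)/ ) {
        return $-[1];    # offset where the captured header begins
    }
    return -1;           # same "not found" convention as index()
}

# Made-up sample data: object 14 comes before object 4.
my $sample = "14 0 obj\012<< /Length 5 >>\012endobj\012"
           . "4 0 obj\012<< /Title (Demo) >>\012endobj\012";

print index( $sample, "4 0 obj" ), "\n";     # hits inside "14 0 obj"
print find_obj_start( $sample, 4 ), "\n";    # hits the real "4 0 obj"
```

The loop in the sub below reaches the same result by repeating index()
and checking the character in front of each hit.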

Below is my attempt at an improved version.
I also compared the performance of Alan's very direct
implementation with the CPAN modules Text::PDF and PDF.pm.

The speed ratios Alan : Text::PDF : PDF.pm are approximately
1 : 6 : 12

For those who are interested: I first started this topic on
perlmonks, and we can continue the discussion there. It is only
MacPerl-specific in that PDF is IMO closer to the Mac world
than to the others.


Best regards,
Axel.

sub gettitle {
    use Fcntl;
    my $file = shift;
    local *IN;
    sysopen( IN, $file, O_RDONLY, 0 ) or die "cannot open '$file': $!\n";
    binmode IN;    # PDF data is binary
    read IN, my ($str), -s $file;
    close IN;

    my ($info_block) = ( $str =~ /\/Info\s(\d+)\s0\sR/ )
      or die "cannot get /Info paragraph\n";
    my $searchpos = -1;
    my $info_start;
    while (1) {
        $info_start = index( $str, "$info_block 0 obj", $searchpos + 1 );
        die "cannot get position of '$info_block 0 obj'\n"
          if $info_start < $searchpos + 1;
        last if ( substr( $str, $info_start - 1, 1 ) =~ /\015|\012/ );
        $searchpos = $info_start;
    }
    my $info_obj = substr(
        $str, $info_start,
        index( $str, ">>", $info_start ) - $info_start + 2
    );
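    # Title text runs up to the last ")" on the same line; the class
    # excludes only CR and LF (the old one also excluded "|").
    my ($title) =
      ( $info_obj =~ /\/Title\s*\(  ([^\015\012]*)  \)  /x )
      or return 'undefined';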
    return $title;
}
