That's perfect. That's exactly what we needed. Thank you!

BTW, I'm running rundig.sh and I'm getting the following output:
"/usr/true is absent or unwilling to execute."

What is true? (and that's NOT a philosophical question ;-)

Also, any idea why the excerpt field would be empty when getting search
results?

Ted Stresen-Reuter

On 8/23/02 4:29 PM, "Gilles Detillieux" <[EMAIL PROTECTED]> wrote:

> According to Ted Stresen-Reuter:
>> On a related note, is there any way to customize the TITLE attribute
>> htsearch displays for pdfs? We have over 100 MB of pdfs we index every night
>> and it would be VERY helpful to be able to provide more accurate titles in
>> the search results.
> 
> Well, the best way is to edit the PDF description information, in Acrobat
> Exchange, to set the title.  That way, the conv_doc.pl or doc2html.pl
> script will pick it up automatically, via pdfinfo.
> 
> Failing that, the other option is to put a hook into your Perl script to
> read the alternate title for a given URL from a file.  Here's how I did
> it in conv_doc.pl, for some PDFs of scientific papers...
> 
> --- contrib/conv_doc.pl.orig    Thu Jul 12 09:38:29 2001
> +++ contrib/conv_doc.pl    Thu Oct 18 12:23:58 2001
> @@ -71,6 +71,7 @@ $CATPDF = "/usr/bin/pdftotext";
> $PDFINFO = "/usr/bin/pdfinfo";
> #$CATPDF = "/usr/local/bin/pdftotext";
> #$PDFINFO = "/usr/local/bin/pdfinfo";
> +$titlelist = "/home/httpd/html/SCRC/manuscripts/titles.lst";
> 
> #########################################
> #
> @@ -183,6 +183,23 @@ if ($ishtml) {
> print "<HTML>\n<head>\n";
> 
> # print out the title, if it's set, and not just a file name, or make one up
> +if (-r $titlelist) {
> +    if (open(INFO, "grep \"$ARGV[2]\" $titlelist 2>$null |")) {
> +        while (<INFO>) {
> +            if (/^$ARGV[2]/) {
> +                s/^$ARGV[2]\s+//;
> +                s/\s+$//;
> +                s/\s+/ /g;
> +                s/&/\&amp\;/g;
> +                s/</\&lt\;/g;
> +                s/>/\&gt\;/g;
> +                $title = $_;
> +                last;
> +            }
> +        }
> +        close INFO;
> +    }
> +}
> if ($title eq "" || $title =~ /^[A-G]:[^\s]+\.[Pp][Dd][Ff]$/) {
>    @parts = split(/\//, $ARGV[2]);         # get the file basename
>    $parts[-1] =~ s/%([A-F0-9][A-F0-9])/pack("C", hex($1))/gie;
> 
> 
> Here, for example, is a line from titles.lst:
> 
> http://www.scrc.umanitoba.ca/SCRC/manuscripts/41.pdf    Spinal circuitry of sensorimotor control of locomotion
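
For reference, the lookup in the patch above can also be done without the
grep pipe. What follows is only a minimal sketch, not code from conv_doc.pl:
the sub names lookup_title and html_escape are made up for illustration, and
it assumes each line of titles.lst holds a URL, then whitespace, then the
title. It compares the URL as an exact string, so characters such as "." or
"?" in the URL are not interpreted as regular expression metacharacters.

```perl
use strict;
use warnings;

# Escape the three characters that matter inside an HTML <title>.
sub html_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Return the HTML-escaped title listed for $url in $listfile,
# or "" if the file is unreadable or has no entry for that URL.
# Assumed line format (hypothetical): URL, whitespace, title.
sub lookup_title {
    my ($listfile, $url) = @_;
    open(my $fh, '<', $listfile) or return "";
    while (my $line = <$fh>) {
        chomp $line;
        # Split into the URL and the rest of the line.
        my ($key, $title) = split(/\s+/, $line, 2);
        next unless defined $key && defined $title;
        next unless $key eq $url;       # exact match, not a regex
        $title =~ s/\s+$//;             # trim trailing whitespace
        $title =~ s/\s+/ /g;            # collapse runs of whitespace
        close $fh;
        return html_escape($title);
    }
    close $fh;
    return "";
}
```

In conv_doc.pl this might be called as
$title = lookup_title($titlelist, $ARGV[2]);
just before the fallback that builds a title from the file name.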


