That's perfect. That's exactly what we needed. Thank you!
BTW, I'm running rundig.sh and I'm getting the following output: /usr/true
is absent or unwilling to execute.
What is true? (and that's NOT a philosophical question ;-)
Also, any idea why the excerpt field would be empty when getting search
results?
Ted Stresen-Reuter
On 8/23/02 4:29 PM, "Gilles Detillieux" <[EMAIL PROTECTED]> wrote:
> According to Ted Stresen-Reuter:
>> On a related note, is there any way to customize the TITLE attribute
>> htsearch displays for pdfs? We have over 100 MB of pdfs we index every night
>> and it would be VERY helpful to be able to provide more accurate titles in
>> the search results.
>
> Well, the best way is to edit the PDF description information, in Acrobat
> Exchange, to set the title. That way, the conv_doc.pl or doc2html.pl
> script will pick it up automatically, via pdfinfo.
>
> Failing that, the other option is to put a hook into your Perl script to
> read the alternate title for a given URL from a file. Here's how I did
> it in conv_doc.pl, for some PDFs of scientific papers...
>
> --- contrib/conv_doc.pl.orig Thu Jul 12 09:38:29 2001
> +++ contrib/conv_doc.pl Thu Oct 18 12:23:58 2001
> @@ -71,6 +71,7 @@ $CATPDF = "/usr/bin/pdftotext";
> $PDFINFO = "/usr/bin/pdfinfo";
> #$CATPDF = "/usr/local/bin/pdftotext";
> #$PDFINFO = "/usr/local/bin/pdfinfo";
> +$titlelist = "/home/httpd/html/SCRC/manuscripts/titles.lst";
>
> #########################################
> #
> @@ -183,6 +183,23 @@ if ($ishtml) {
> print "<HTML>\n<head>\n";
>
> # print out the title, if it's set, and not just a file name, or make one up
> +if (-r $titlelist) {
> +    if (open(INFO, "grep \"$ARGV[2]\" $titlelist 2>$null |")) {
> +        while (<INFO>) {
> +            if (/^$ARGV[2]/) {
> +                s/^$ARGV[2]\s+//;
> +                s/\s+$//;
> +                s/\s+/ /g;
> +                s/&/\&amp\;/g;
> +                s/</\&lt\;/g;
> +                s/>/\&gt\;/g;
> +                $title = $_;
> +                last;
> +            }
> +        }
> +        close INFO;
> +    }
> +}
> if ($title eq "" || $title =~ /^[A-G]:[^\s]+\.[Pp][Dd][Ff]$/) {
> @parts = split(/\//, $ARGV[2]); # get the file basename
> $parts[-1] =~ s/%([A-F0-9][A-F0-9])/pack("C", hex($1))/gie;
>
>
> Here, for example, is a line from titles.lst:
>
> http://www.scrc.umanitoba.ca/SCRC/manuscripts/41.pdf Spinal circuitry of sensorimotor control of locomotion
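
For anyone who wants to try the lookup logic outside conv_doc.pl, here is
a rough sketch of the same idea in Python. It is not part of ht://Dig or of
the patch above. The function name lookup_title is made up for
illustration, and it assumes each entry in titles.lst sits on a single
line: the URL, some whitespace, then the title. Unlike the patch, it
compares the URL as a plain string instead of using it as a regular
expression, so characters such as "." or "?" in the URL are not treated
as pattern metacharacters.

```python
import html
import io


def lookup_title(url, lines):
    """Return the HTML-escaped title listed for url, or None if absent.

    lines is any iterable of text lines, e.g. an open titles.lst file.
    """
    for line in lines:
        # First field is the URL, the remainder of the line is the title.
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0] == url:
            # Trim the ends and collapse runs of whitespace to one space.
            title = " ".join(fields[1].split())
            # Escape &, < and > so the title is safe inside <title>.
            return html.escape(title, quote=False)
    return None


# Example using the sample entry quoted above (held in memory, not a file).
sample = io.StringIO(
    "http://www.scrc.umanitoba.ca/SCRC/manuscripts/41.pdf "
    "Spinal circuitry of sensorimotor control of locomotion\n"
)
print(lookup_title(
    "http://www.scrc.umanitoba.ca/SCRC/manuscripts/41.pdf", sample))
```

With a real list you would pass an open file instead of the in-memory
sample, e.g. open("titles.lst") using whatever path your $titlelist
points to.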