Re: [htdig] DELETED, no excerpt on PDF's

Marcus Valentine Wed, 18 Jul 2001 01:03:05 -0700
You are running under Windows, yes? I had the same problem. I could get
output from doc2html.pl at the command line, but the output in the full
installation wasn't finding its way into htdig.

I sidestepped the problem by running under Linux.
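
If you want to rule out the converter itself before switching platforms, the check David describes (run doc2html.pl by hand and confirm the output contains extracted text) could be scripted roughly as below. This is only a sketch under assumptions: the names body_text and run_converter are mine, not part of ht://Dig, and invoking the script through "perl" is a guess at what works on your system. It only tells you whether the HTML has any body text; it says nothing about why htdig might not receive it.

```python
import subprocess
import sys
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect character data outside <script>, <style> and <title>."""

    SKIP = {"script", "style", "title"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def body_text(html):
    """Return the text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def run_converter(script, filename, mime_type="application/pdf"):
    """Run the converter the same way as on the command line.

    Assumption: the script is started via "perl"; adjust if your
    installation runs it differently.
    """
    result = subprocess.run(
        ["perl", script, filename, mime_type],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout, result.stderr


if __name__ == "__main__" and len(sys.argv) == 3:
    out, err = run_converter(sys.argv[1], sys.argv[2])
    text = body_text(out)
    print("characters of body text:", len(text))
    if err.strip():
        print("converter wrote to stderr:", err.strip())
```

Run it with the path to doc2html.pl and a PDF as the two arguments. A count of zero would point at the converter; a non-zero count, as in Per-Henrik's case, points at how htdig calls it.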

At 18:49 17/07/01 +0200, Per-Henrik Persson wrote:
>* David Adams <[EMAIL PROTECTED]> [010717 18:10]:
>> You haven't mentioned any warning messages from doc2html, so it must be
>> doing something?
>> 
>> Have you tried doc2html from the command line?  The format is:
>> 
>>     doc2html.pl filename application/pdf
>> 
>> Check that the output does contain text extracted from the file.
>> 
>The output is perfectly valid HTML, with a lot of extracted text from
>the pdf. This part seems to work...
>
>> If that is OK, then the problem may be in your configuration file, check
>> that the external_parsers
>> attribute is used correctly.
>> 
>As I said, htdig really runs doc2html.pl, but then htmerge deletes the
>files with the message Deleted, no excerpt: 209/http:/...
>
>The external_parsers part in htdig.conf looks like the following:
>--snip
>
>external_parsers: application/pdf->text/html /opt/www/htdig/bin/doc2html.pl
>
>--snip
>
>Still doesn't work... As I said before, the same behavior occurred when I
>tried using conv_doc.pl


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html