"David Adams" <[EMAIL PROTECTED]> writes:

> Your external_parsers: statement looks OK to me, but it seems that htdig is
> ignoring it.
> 
> Do you get any error messages from htdig which might suggest a problem?
> Version 3.2 will ignore all attributes following an error in the
> configuration file.

I'm using 3.1.6. But I do get an error message:

# rundig -vvv > /tmp/htlogg
# DB2 problem...: missing or empty key value specified

What does this mean?

> 
> Check also that you have only one external_parsers: statement in your
> configuration file, that the line immediately before external_parsers: is
> correct and doesn't end in a backslash, and that you are using the correct
> configuration file.
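
The checks David lists can be sketched as a small script. This is only a rough sketch, not part of ht://Dig: the function name `lint_config` is made up for illustration, and it assumes the usual `name: value` attribute syntax with `#` comment lines and a trailing backslash as line continuation. Whether a backslash at the end of a comment line also continues is an assumption, so the sketch flags it conservatively.

```python
import re

# Assumed attribute syntax: optional whitespace, a name, then a colon.
ATTR_RE = re.compile(r"^\s*([A-Za-z_][\w.-]*)\s*:")


def lint_config(text, attribute="external_parsers"):
    """Return a list of warnings about `attribute` in htdig-style config text.

    Hypothetical helper for illustration; it does not reproduce htdig's parser.
    """
    warnings = []
    definitions = []   # 1-based line numbers where the attribute is defined
    continued = False  # True when the previous line ended in a backslash

    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        is_comment = stripped.startswith("#")
        match = None if is_comment else ATTR_RE.match(line)
        if match and match.group(1) == attribute:
            definitions.append(number)
            if continued:
                warnings.append(
                    f"line {number}: previous line ends in a backslash, "
                    f"so '{attribute}' may be swallowed into the preceding line"
                )
        # Assumption: any trailing backslash (even on a comment) continues.
        continued = stripped.endswith("\\")

    if len(definitions) > 1:
        warnings.append(
            f"'{attribute}' is defined {len(definitions)} times "
            f"(lines {definitions})"
        )
    if not definitions:
        warnings.append(f"'{attribute}' is not defined at all")
    return warnings


if __name__ == "__main__":
    import sys

    # Usage: python lint_config.py /path/to/htdig.conf
    with open(sys.argv[1], encoding="latin-1") as handle:
        for warning in lint_config(handle.read()):
            print(warning)
```

An empty result only means none of these three problems was found; it does not prove that htdig is reading the file you expect.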

I was planning to include the htdig.conf file, minus comments, here. But I
tried the stripped conf file first and, wonder of wonders, it worked!
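
For anyone wanting to repeat the experiment, stripping a config file down to its active lines might look like the sketch below. How the file was actually stripped is not stated in this thread; `strip_comments` is a hypothetical helper, and it assumes that comments are whole lines starting with `#`.

```python
def strip_comments(text):
    """Drop blank lines and whole-line '#' comments from config text.

    Hypothetical helper for illustration. It does not handle a '#' that
    appears inside a value, and it leaves continuation backslashes as-is.
    """
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comment lines
        kept.append(line.rstrip())  # also drop trailing whitespace
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    import sys

    # Usage: python strip_comments.py htdig.conf > htdig.stripped.conf
    with open(sys.argv[1], encoding="latin-1") as handle:
        sys.stdout.write(strip_comments(handle.read()))
```

Running htdig against the stripped copy and comparing the result with the original file is one way to narrow down which line was causing the attribute to be ignored.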

YES :)

Thank you, David, this made my day :)

Vidar
> 
> David Adams
> Corporate Information Services
> Information Systems Services
> University of Southampton
> 
> ----- Original Message ----- 
> From: "Vidar Ringstrom" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Tuesday, June 10, 2003 12:21 PM
> Subject: [htdig] PDF::parse: cannot open acroread output
> 
> 
> >
> > What am I missing?
> >
> > doc2html.pl, pdfinfo and pdftotext work from the command line.
> >
> > Some variables from htdig.conf:
> >
> > start_url:              http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf
> > max_head_length:        10000
> > max_doc_size:           15000000
> > external_parsers: application/pdf->text/html /usr/local/bin/doc2html.pl
> >
> >
> > Output from:  htdig -vvv
> >
> > 1:1:http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf
> > New server: www.tfb.no, 80
> > Retrieval command for http://www.tfb.no/robots.txt: GET /robots.txt HTTP/1.0
> > User-Agent: htdig/3.1.6 ([EMAIL PROTECTED])
> > Authorization: Basic aHRkaWdyb2JvdDplbmhldDg1MA==
> > Host: www.tfb.no
> >
> > Header line: HTTP/1.1 200 OK
> > Header line: Date: Tue, 10 Jun 2003 11:09:56 GMT
> > Header line: Server: Apache/1.3.27 (Unix) PHP/4.3.1 FrontPage/4.0.4.3 mod_ssl/2.8.14 OpenSSL/0.9.7a
> > Header line: Last-Modified: Fri, 25 Oct 2002 07:28:28 GMT
> > Converted Fri, 25 Oct 2002 07:28:28 GMT to Fri, 25 Oct 2002 07:28:28
> > Header line: ETag: "35f514-32-3db8f29c"
> > Header line: Accept-Ranges: bytes
> > Header line: Content-Length: 50
> > Header line: Connection: close
> > Header line: Content-Type: text/plain
> > Header line:
> > returnStatus = 0
> > Read 50 from document
> > Read a total of 50 bytes
> > Parsing robots.txt file using myname = htdig
> > Robots.txt line: # robots.txt
> > Robots.txt line: #
> > Robots.txt line: User-agent: *
> > Found 'user-agent' line: *
> > Robots.txt line: Disallow: /cgi-bin/
> > Found 'disallow' line: /cgi-bin/
> > Pattern: /cgi-bin/
> >  pushed
> > pick: www.tfb.no, # servers = 1
> >
> > 0:0:0:http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf: Retrieval command for http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf: GET /db/adresseboktrondheim/1905/3_7_20030528_134812.pdf HTTP/1.0
> > User-Agent: htdig/3.1.6 ([EMAIL PROTECTED])
> > Authorization: Basic aHRkaWdyb2JvdDplbmhldDg1MA==
> > Host: www.tfb.no
> >
> > Header line: HTTP/1.1 200 OK
> > Header line: Date: Tue, 10 Jun 2003 11:09:56 GMT
> > Header line: Server: Apache/1.3.27 (Unix) PHP/4.3.1 FrontPage/4.0.4.3 mod_ssl/2.8.14 OpenSSL/0.9.7a
> > Header line: Last-Modified: Wed, 28 May 2003 10:51:51 GMT
> > Converted Wed, 28 May 2003 10:51:51 GMT to Wed, 28 May 2003 10:51:51
> > Header line: ETag: "2d7886-ef80b-3ed494c7"
> > Header line: Accept-Ranges: bytes
> > Header line: Content-Length: 981003
> > Header line: Connection: close
> > Header line: Content-Type: application/pdf
> > Header line:
> > returnStatus = 0
> > Read 8192 from document
> >
> >     -removed several more lines with "Read 8192 from document"
> >
> > Read 6155 from document
> > Read a total of 981003 bytes
> > PDF::setContents(981003 bytes)
> >
> > PDF::parse(http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf)
> > PDF::parse: cannot open acroread output from http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf
> >  size = 981003
> > pick: www.tfb.no, # servers = 1
> >
> >
> >
> >
> > Regards, Vidar
> >
> > --
> > Vidar Ringstrøm Phone 33 11 68 00
> > Bibliotek-Systemer As Fax 33 11 68 22
> > Boks 2093, Stubberød, 3255 Larvik
> >
> >
> > -------------------------------------------------------
> > This SF.net email is sponsored by:  Etnus, makers of TotalView, The best
> > thread debugger on the planet. Designed with thread debugging features
> > you've never dreamed of, try TotalView 6 free at www.etnus.com.
> > _______________________________________________
> > htdig-general mailing list <[EMAIL PROTECTED]>
> > To unsubscribe, send a message to
> <[EMAIL PROTECTED]> with a subject of unsubscribe
> > FAQ: http://htdig.sourceforge.net/FAQ.html
> >
> >

-- 

Regards, Vidar

--
Vidar Ringstrøm                         Phone 33 11 68 00
Bibliotek-Systemer As           Fax 33 11 68 22
Boks 2093, Stubberød, 3255 Larvik

