Your external_parsers: statement looks OK to me, but it seems that htdig is ignoring it.
Do you get any error messages from htdig which might suggest a problem? Version 3.2 will ignore all attributes following an error in the configuration file. Check also that you have only one external_parsers: statement in your configuration file, that the line immediately before external_parsers: is correct and doesn't end in a backslash, and that you are using the correct configuration file.

David Adams
Corporate Information Services
Information Systems Services
University of Southampton

----- Original Message -----
From: "Vidar Ringstrom" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, June 10, 2003 12:21 PM
Subject: [htdig] PDF::parse: cannot open acroread output

> What am I missing?
>
> doc2html.pl, pdfinfo and pdftotext works from the commandline.
>
> Some variables from htdig.conf:
>
> start_url: http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf
> max_head_length: 10000
> max_doc_size: 15000000
> external_parsers: application/pdf->text/html /usr/local/bin/doc2html.pl
>
>
> Output from: htdig -vvv
>
> 1:1:http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf
> New server: www.tfb.no, 80
> Retrieval command for http://www.tfb.no/robots.txt: GET /robots.txt HTTP/1.0
> User-Agent: htdig/3.1.6 ([EMAIL PROTECTED])
> Authorization: Basic aHRkaWdyb2JvdDplbmhldDg1MA==
> Host: www.tfb.no
>
> Header line: HTTP/1.1 200 OK
> Header line: Date: Tue, 10 Jun 2003 11:09:56 GMT
> Header line: Server: Apache/1.3.27 (Unix) PHP/4.3.1 FrontPage/4.0.4.3 mod_ssl/2.8.14 OpenSSL/0.9.7a
> Header line: Last-Modified: Fri, 25 Oct 2002 07:28:28 GMT
> Converted Fri, 25 Oct 2002 07:28:28 GMT to Fri, 25 Oct 2002 07:28:28
> Header line: ETag: "35f514-32-3db8f29c"
> Header line: Accept-Ranges: bytes
> Header line: Content-Length: 50
> Header line: Connection: close
> Header line: Content-Type: text/plain
> Header line:
> returnStatus = 0
> Read 50 from document
> Read a total of 50 bytes
> Parsing robots.txt file using myname = htdig
> Robots.txt line: # robots.txt
> Robots.txt line: #
> Robots.txt line: User-agent: *
> Found 'user-agent' line: *
> Robots.txt line: Disallow: /cgi-bin/
> Found 'disallow' line: /cgi-bin/
> Pattern: /cgi-bin/
> pushed
> pick: www.tfb.no, # servers = 1
> 0:0:0:http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf: Retrieval command for http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf: GET /db/adresseboktrondheim/1905/3_7_20030528_134812.pdf HTTP/1.0
> User-Agent: htdig/3.1.6 ([EMAIL PROTECTED])
> Authorization: Basic aHRkaWdyb2JvdDplbmhldDg1MA==
> Host: www.tfb.no
>
> Header line: HTTP/1.1 200 OK
> Header line: Date: Tue, 10 Jun 2003 11:09:56 GMT
> Header line: Server: Apache/1.3.27 (Unix) PHP/4.3.1 FrontPage/4.0.4.3 mod_ssl/2.8.14 OpenSSL/0.9.7a
> Header line: Last-Modified: Wed, 28 May 2003 10:51:51 GMT
> Converted Wed, 28 May 2003 10:51:51 GMT to Wed, 28 May 2003 10:51:51
> Header line: ETag: "2d7886-ef80b-3ed494c7"
> Header line: Accept-Ranges: bytes
> Header line: Content-Length: 981003
> Header line: Connection: close
> Header line: Content-Type: application/pdf
> Header line:
> returnStatus = 0
> Read 8192 from document
>
> -removed several more lines with "Read 8192 from document"
>
> Read 6155 from document
> Read a total of 981003 bytes
> PDF::setContents(981003 bytes)
> PDF::parse(http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf)
> PDF::parse: cannot open acroread output from http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf
> size = 981003
> pick: www.tfb.no, # servers = 1
>
>
> Regards, Vidar
>
> --
> Vidar Ringstrøm                  Telephone 33 11 68 00
> Bibliotek-Systemer As            Fax 33 11 68 22
> Boks 2093, Stubberød, 3255 Larvik
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: Etnus, makers of TotalView, The best
> thread debugger on the planet. Designed with thread debugging features
> you've never dreamed of, try TotalView 6 free at www.etnus.com.
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
> FAQ: http://htdig.sourceforge.net/FAQ.html
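
The configuration checks suggested in the reply at the top of this message (exactly one external_parsers: statement, and no trailing backslash on the line immediately before it) can be sketched as a short Python script. This is only an illustration and not part of ht://Dig: the function name check_config is made up here, and the script assumes, as the reply implies, that a line ending in a backslash makes the following line its continuation. It does not verify that htdig is reading the file you think it is.

```python
import re
import sys


def check_config(text, attribute="external_parsers"):
    """Return a list of warnings about `attribute` in htdig config text.

    Heuristic only: a line ending in a backslash is assumed to swallow
    the next line as a continuation.
    """
    warnings = []
    definitions = []            # line numbers of usable definitions
    previous_continues = False  # did the previous line end in a backslash?
    pattern = re.compile(r"^\s*" + re.escape(attribute) + r"\s*:")

    for number, line in enumerate(text.splitlines(), start=1):
        if pattern.match(line):
            if previous_continues:
                warnings.append(
                    f"line {number}: previous line ends in a backslash, "
                    f"so this {attribute}: may be read as its continuation"
                )
            else:
                definitions.append(number)
        previous_continues = line.endswith("\\")

    if not definitions:
        warnings.append(f"no usable {attribute}: statement found")
    elif len(definitions) > 1:
        warnings.append(
            f"{attribute}: defined more than once, on lines "
            + ", ".join(str(n) for n in definitions)
        )
    return warnings


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path of the configuration file as the first argument.
    with open(sys.argv[1], encoding="latin-1") as handle:
        for warning in check_config(handle.read()):
            print(warning)
```

Run it against the same file you pass to htdig with -c, so that the check also covers the "correct configuration file" point; no output means none of these particular problems were found.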

