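Your external_parsers: statement looks OK to me, but it seems that htdig is
ignoring it: the "PDF::parse: cannot open acroread output" message comes
from htdig's built-in PDF parser, not from doc2html.pl.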

Do you get any error messages from htdig which might suggest a problem?
Version 3.2 will ignore all attributes following an error in the
configuration file.

Check also that you have only one external_parsers: statement in your
configuration file, that the line immediately before external_parsers: is
correct and doesn't end in a backslash, and that you are using the correct
configuration file.
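
The first two checks can be done mechanically. The following is only a
rough sketch in Python, not htdig's own parser: the function name
check_external_parsers and the sample text are made up for illustration,
and it assumes that a trailing backslash continues an attribute onto the
next line and that lines starting with # are comments.

```python
import re


def check_external_parsers(conf_text):
    """Return a list of warnings about external_parsers: in an htdig config.

    This only looks at the raw lines; it does not follow include files
    or emulate htdig's real configuration parser.
    """
    lines = conf_text.splitlines()
    warnings = []

    # Indices of lines that define external_parsers (commented-out lines
    # start with '#' and therefore do not match).
    hits = [i for i, line in enumerate(lines)
            if re.match(r"\s*external_parsers\s*:", line)]

    if not hits:
        warnings.append("no external_parsers: line found")
    if len(hits) > 1:
        nums = ", ".join(str(i + 1) for i in hits)
        warnings.append(
            "external_parsers: defined more than once (lines %s)" % nums)

    # A backslash at the end of the previous line would make the
    # external_parsers: line part of the previous attribute's value.
    for i in hits:
        if i > 0 and lines[i - 1].rstrip().endswith("\\"):
            warnings.append(
                "line %d ends in a backslash, so line %d is read as its "
                "continuation" % (i, i + 1))
    return warnings


# Hypothetical sample: the stray backslash hides external_parsers:.
sample = (
    "max_doc_size:           15000000 \\\n"
    "external_parsers: application/pdf->text/html "
    "/usr/local/bin/doc2html.pl\n"
)
for warning in check_external_parsers(sample):
    print(warning)
```

To check a real file, read it into a string and pass that to the function.
It cannot tell you whether htdig is reading the file you think it is; for
that, name the file explicitly with htdig's -c option.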

David Adams
Corporate Information Services
Information Systems Services
University of Southampton

----- Original Message ----- 
From: "Vidar Ringstrom" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, June 10, 2003 12:21 PM
Subject: [htdig] PDF::parse: cannot open acroread output


>
> What am I missing?
>
> doc2html.pl, pdfinfo and pdftotext work from the command line.
>
> Some variables from htdig.conf:
>
> start_url:              http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf
> max_head_length:        10000
> max_doc_size:           15000000
> external_parsers: application/pdf->text/html /usr/local/bin/doc2html.pl
>
>
> Output from:  htdig -vvv
>
> 1:1:http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf
> New server: www.tfb.no, 80
> Retrieval command for http://www.tfb.no/robots.txt: GET /robots.txt HTTP/1.0
> User-Agent: htdig/3.1.6 ([EMAIL PROTECTED])
> Authorization: Basic aHRkaWdyb2JvdDplbmhldDg1MA==
> Host: www.tfb.no
>
> Header line: HTTP/1.1 200 OK
> Header line: Date: Tue, 10 Jun 2003 11:09:56 GMT
> Header line: Server: Apache/1.3.27 (Unix) PHP/4.3.1 FrontPage/4.0.4.3 mod_ssl/2.8.14 OpenSSL/0.9.7a
> Header line: Last-Modified: Fri, 25 Oct 2002 07:28:28 GMT
> Converted Fri, 25 Oct 2002 07:28:28 GMT to Fri, 25 Oct 2002 07:28:28
> Header line: ETag: "35f514-32-3db8f29c"
> Header line: Accept-Ranges: bytes
> Header line: Content-Length: 50
> Header line: Connection: close
> Header line: Content-Type: text/plain
> Header line:
> returnStatus = 0
> Read 50 from document
> Read a total of 50 bytes
> Parsing robots.txt file using myname = htdig
> Robots.txt line: # robots.txt
> Robots.txt line: #
> Robots.txt line: User-agent: *
> Found 'user-agent' line: *
> Robots.txt line: Disallow: /cgi-bin/
> Found 'disallow' line: /cgi-bin/
> Pattern: /cgi-bin/
>  pushed
> pick: www.tfb.no, # servers = 1
>
> 0:0:0:http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf: Retrieval command for http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf: GET /db/adresseboktrondheim/1905/3_7_20030528_134812.pdf HTTP/1.0
> User-Agent: htdig/3.1.6 ([EMAIL PROTECTED])
> Authorization: Basic aHRkaWdyb2JvdDplbmhldDg1MA==
> Host: www.tfb.no
>
> Header line: HTTP/1.1 200 OK
> Header line: Date: Tue, 10 Jun 2003 11:09:56 GMT
> Header line: Server: Apache/1.3.27 (Unix) PHP/4.3.1 FrontPage/4.0.4.3 mod_ssl/2.8.14 OpenSSL/0.9.7a
> Header line: Last-Modified: Wed, 28 May 2003 10:51:51 GMT
> Converted Wed, 28 May 2003 10:51:51 GMT to Wed, 28 May 2003 10:51:51
> Header line: ETag: "2d7886-ef80b-3ed494c7"
> Header line: Accept-Ranges: bytes
> Header line: Content-Length: 981003
> Header line: Connection: close
> Header line: Content-Type: application/pdf
> Header line:
> returnStatus = 0
> Read 8192 from document
>
>     -removed several more lines with "Read 8192 from document"
>
> Read 6155 from document
> Read a total of 981003 bytes
> PDF::setContents(981003 bytes)
>
> PDF::parse(http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf)
> PDF::parse: cannot open acroread output from http://www.tfb.no/db/adresseboktrondheim/1905/3_7_20030528_134812.pdf
>  size = 981003
> pick: www.tfb.no, # servers = 1
>
>
>
>
> Regards, Vidar
>
> --
> Vidar Ringstrøm Telefon 33 11 68 00
> Bibliotek-Systemer As Fax 33 11 68 22
> Boks 2093, Stubberød, 3255 Larvik
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by:  Etnus, makers of TotalView, The best
> thread debugger on the planet. Designed with thread debugging features
> you've never dreamed of, try TotalView 6 free at www.etnus.com.
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> To unsubscribe, send a message to
<[EMAIL PROTECTED]> with a subject of unsubscribe
> FAQ: http://htdig.sourceforge.net/FAQ.html
>
>


