My first thought is that doc2html.pl needs a further improvement:
it should be possible to make it limit the amount of output it sends back
to htdig. I'm tied up with other work at present, so I can't do this myself
for a while.
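
As a rough illustration of the idea only (this is not a patch to
doc2html.pl, which is a Perl script), the converter's output could be cut
off after a fixed number of bytes before it is passed on. The names
cap_output and MAX_BYTES and the 1000000-byte cap are assumptions made for
this sketch, and it relies on a head that accepts -c, which is widely
available but not guaranteed everywhere:

```shell
#!/bin/sh
# Sketch only: cap how many bytes of converter output are passed on.
# cap_output and MAX_BYTES are hypothetical names, not part of doc2html.pl.
MAX_BYTES=1000000   # assumed cap; choose a value that suits the index

cap_output() {
    # Copy at most $1 bytes from stdin to stdout, then stop reading.
    head -c "$1"
}

# Possible use, assuming the converter writes its HTML to standard output:
#   xlhtml death_data.xls | cap_output "$MAX_BYTES"
```

The output may then end in the middle of a tag, so whatever reads it has
to tolerate truncated HTML.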

I have had a less severe problem with the pptHtml utility from the same
source: some files cause the pptHtml process to use hundreds of megabytes
of memory. My "fix" was to add

    limit vmemory 300m

to the Bourne shell script that invokes htdig.
The process then complains it is out of memory and dies gracefully.
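
For anyone whose wrapper script runs under a Bourne-style shell that has no
limit command (limit is normally a csh-family built-in), a comparable
sketch uses ulimit instead. The -v option is an extension found in bash,
dash and similar shells rather than something POSIX guarantees, it takes
its value in kilobytes there, and run_limited is a name made up for this
example:

```shell
#!/bin/sh
# Sketch only: run a command with a cap on its virtual memory.
# run_limited is a hypothetical helper; "ulimit -v" is a common shell
# extension (bash, dash, ksh), not a POSIX requirement.
run_limited() {
    limit_kb=$1     # limit in kilobytes
    shift
    (
        # The subshell keeps the limit from affecting the calling script.
        ulimit -v "$limit_kb" || exit 1
        "$@"
    )
}

# 300 MB = 300 * 1024 KB = 307200 KB. Possible use (the path is a placeholder):
#   run_limited 307200 htdig -c /path/to/htdig.conf
```

The limit is inherited by child processes, so converters started by htdig
are covered as well.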

David Adams
University of Southampton
Computing Services

----- Original Message -----
From: "Marcus Valentine" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, November 22, 2001 3:19 PM
Subject: [htdig] xlhtml 0.3 crashing htdig 3.1.5


> On my intranet, there is an unfortunate xls file. Although the xls file is
> only 266 KB in size, converting it with xlhtml 0.3 at the command line
> results in a 37 MB html file. (Running with the -a option [aggressive html
> optimization] reduces the file size to 23 MB.)
>
> Running htdig 3.1.5 with doc2html.pl version 3 calling xlhtml 0.3 results
> in an htdig core dump when it gets to this document. Htdig runs on Red Hat
> Linux 6.2.
>
> Any ideas?
>
> Marcus Valentine
>
> Relevant section of htdig -vvv is below
>
> * * * *
>
> pick: tiger, # servers = 1
> 20:20:1:http://tiger/projects/TOR027/Non-RCS/Results/death_data.xls:
> Retrieval command for http://tiger/projects/TOR02$
> User-Agent: htdig/3.1.5 ([EMAIL PROTECTED])
> Referer: http://tiger/projects/TOR027/Non-RCS/Results/
> Host: tiger
>
> Header line: HTTP/1.1 200 OK
> Header line: Date: Thu, 22 Nov 2001 14:34:07 GMT
> Header line: Server: Apache/1.3.22 (Win32)
> Header line: Last-Modified: Mon, 05 Mar 2001 17:25:49 GMT
> Translated Mon, 05 Mar 2001 17:25:49 GMT to 2001-03-05 17:25:49 (101)
> And converted to Mon, 05 Mar 2001 17:25:49
> Header line: ETag: "0-42800-3aa3cc1d"
> Header line: Accept-Ranges: bytes
> Header line: Content-Length: 272384
> Header line: Connection: close
> Header line: Content-Type: application/vnd.ms-excel
> Header line:
> returnStatus = 0
> Read 8192 from document
> [previous line repeated 32 more times: 33 reads of 8192 bytes in total]
> Read 2048 from document
> Read a total of 272384 bytes
> Aborted (core dumped)
>
> * * * *
>
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> To unsubscribe, send a message to <[EMAIL PROTECTED]> with a
> subject of unsubscribe
> FAQ: http://htdig.sourceforge.net/FAQ.html
>

