Oops, I feel like a stooge. Increasing max_doc_size did indeed fix the problem with leftover processes.
Thanks!!!

Bryan

-----Original Message-----
From: Gilles Detillieux [mailto:[EMAIL PROTECTED]]
Sent: Thursday, June 13, 2002 10:26 AM
To: Bryan C. Woods
Cc: David Adams; [EMAIL PROTECTED]
Subject: Re: [htdig] htdig/ppthtml hangs

According to Bryan C. Woods:
> Thanks for your help David. I haven't yet pinpointed the offending
> documents, however running:
>
> for i in `ps auwwx | egrep '([^ ]* *){10}[^ ]*ppthtml[ \n]' |
>     sed 's/[^ ]\{1,\} \{1,\}\([^ ]\{1,\}\).\{1,\}/\1/'`; do kill -9 $i; done
>
> Has worked for at least killing left over (hung?) processes.
>
>
> Thanks for your time!
>
> Bryan

Hi, Bryan. With all this e-mailing back and forth, you still haven't
answered the question of whether you've tried increasing max_doc_size
and whether that makes the problem go away. We do know for a fact,
from your very first e-mail, that at least one such .ppt file was being
truncated to 200000 bytes. You wrote...

> This is where rundig -vvv hangs:
>
> Header line: HTTP/1.1 200 OK
> Header line: Server: Microsoft-IIS/5.0
> Header line: MicrosoftOfficeWebServer: 5.0_Collab
> Header line: Date: Wed, 12 Jun 2002 00:49:19 GMT
> Header line: Content-Type: application/vnd.ms-powerpoint
> Header line: Accept-Ranges: bytes
> Header line: Last-Modified: Wed, 29 May 2002 20:26:54 GMT
> Converted Wed, 29 May 2002 20:26:54 GMT to Wed, 29 May 2002 20:26:54
> Header line: ETag: "80ea682c4f7c21:860"
> Header line: Content-Length: 204288
> Header line:
> returnStatus = 0
> Read 8192 from document
> Read 8192 from document
...
> Read 8192 from document
> Read 3392 from document
> Read a total of 200000 bytes

So, somewhere you have a ms-powerpoint file of 204288 bytes, of which
htdig only read a total of 200000 bytes. You must increase max_doc_size
to at least the size of the largest file you need to index.

--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/
Dept. Physiology, U.
of Manitoba Winnipeg, MB R3E 3J7 (Canada)
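
For anyone finding this thread later, here is a minimal sketch of the change described above, assuming the attribute is set in the htdig.conf file that rundig reads. The value 250000 is only an illustrative assumption, picked because it exceeds the 204288-byte Content-Length shown in the log; the real value should be at least the size of the largest document that needs to be indexed.

```
# htdig.conf (sketch; the value is an assumed example, not a recommendation)
# The log shows a 204288-byte .ppt file cut off at 200000 bytes,
# so the limit has to be raised above the largest file to be indexed.
max_doc_size:	250000
```

After changing the value, rerunning rundig -vvv should show a "Read a total of" figure that matches the Content-Length header for that document instead of stopping at 200000 bytes.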

