Oops, I feel like a stooge. Increasing max_doc_size did indeed fix the problem
with leftover processes.

Thanks!!!

Bryan

-----Original Message-----
From: Gilles Detillieux [mailto:[EMAIL PROTECTED]]
Sent: Thursday, June 13, 2002 10:26 AM
To: Bryan C. Woods
Cc: David Adams; [EMAIL PROTECTED]
Subject: Re: [htdig] htdig/ppthtml hangs


According to Bryan C. Woods:
> Thanks for your help, David. I haven't yet pinpointed the offending
> documents; however, running:
>  
>  for i in `ps auwwx | egrep '([^ ]* *){10}[^ ]*ppthtml[ \n]' |
>      sed 's/[^ ]\{1,\} \{1,\}\([^ ]\{1,\}\).\{1,\}/\1/'`; do kill -9 $i; done
>  
> has at least worked for killing leftover (hung?) processes.
>  
>  
> Thanks for your time!
>  
> Bryan

Hi, Bryan.  With all this e-mailing back and forth, you still haven't
answered the question of whether you've tried increasing max_doc_size
and whether that makes the problem go away.  We do know for a fact,
from your very first e-mail, that at least one such .ppt file was being
truncated to 200000 bytes.
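
For reference, a minimal sketch of the setting in question, assuming the
usual "attribute: value" syntax of htdig.conf. The value 300000 is only an
illustrative assumption, picked to clear the 204288-byte file in the log
below; it is not a recommended setting.

```
# htdig.conf
# Illustrative value only: anything at or above the largest document to index.
max_doc_size: 300000
```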

You wrote...
>  This is where rundig -vvv hangs:
> 
> Header line: HTTP/1.1 200 OK
> Header line: Server: Microsoft-IIS/5.0
> Header line: MicrosoftOfficeWebServer: 5.0_Collab
> Header line: Date: Wed, 12 Jun 2002 00:49:19 GMT
> Header line: Content-Type: application/vnd.ms-powerpoint
> Header line: Accept-Ranges: bytes
> Header line: Last-Modified: Wed, 29 May 2002 20:26:54 GMT
> Converted Wed, 29 May 2002 20:26:54 GMT to Wed, 29 May 2002 20:26:54
> Header line: ETag: "80ea682c4f7c21:860"
> Header line: Content-Length: 204288
> Header line:
> returnStatus = 0
> Read 8192 from document
> Read 8192 from document
...
> Read 8192 from document
> Read 3392 from document
> Read a total of 200000 bytes

So, somewhere you have a ms-powerpoint file of 204288 bytes, of which
htdig only read a total of 200000 bytes.  You must increase max_doc_size
to at least the size of the largest file you need to index.
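
To spot such truncated documents without reading the whole log by eye,
something along these lines might help. It is only a sketch: the function
name find_truncated is made up for illustration, and it assumes the verbose
output contains "Header line: Content-Length: N" and "Read a total of N bytes"
lines exactly as in the excerpt above.

```shell
# Scan htdig/rundig -vvv output on stdin and report every document whose
# Content-Length header is larger than the number of bytes htdig read.
# (find_truncated is a hypothetical helper name, not part of ht://Dig.)
find_truncated() {
    awk '
        # Remember the most recent Content-Length header value.
        /Header line: Content-Length:/ { len = $NF + 0 }

        # "Read a total of N bytes": N is the next-to-last field.
        /Read a total of [0-9]+ bytes/ {
            got = $(NF - 1) + 0
            if (len > got) {
                printf "truncated: read %d of %d bytes\n", got, len
                if (len > max) max = len
            }
            len = 0
        }

        # Largest truncated document seen, as a lower bound for the setting.
        END { if (max > 0) printf "max_doc_size should be at least %d\n", max }
    '
}
```

It could be fed directly from a verbose run, e.g. "rundig -vvv 2>&1 |
find_truncated", or from a saved log file redirected to its standard input.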

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
