According to David Adams:
> I suggest a modification of the action of htdig as regards max_doc_size.  At
> present (3.1.5 and 3.* I think) htdig fetches up to max_doc_size bytes and no
> more.  I suggest that it stops fetching as soon as it establishes that the
> document is larger than max_doc_size.  As the size is often given in the
> HTTP header this could prevent it fetching megabytes of .PPT and .PDF only
> for the conversion utilities to fail because they are given incomplete
> files.  Does that sound reasonable?

The problem is that for text files, and especially text/html, truncating
the file may be preferable to tossing it out entirely.  At least that seems
to have been Andrew's intent when he wrote the original code.  It's only
with the added support for binary file types, where a truncated file makes
the external converter fail, that this really poses a problem.  I dunno, I
have a bad feeling that if we implemented this we'd
get a lot of complaints saying "hey, I needed that truncation feature!"
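
One possible middle ground, sketched here only to make the trade-off
concrete: keep truncating text types, but skip other types when the HTTP
header already says the document is too big.  This is not htdig's actual
code; plan_fetch, TRUNCATABLE_PREFIXES, and the three return values are
hypothetical names made up for illustration.

```python
from typing import Optional

# Hypothetical: media types for which a truncated document is still useful.
TRUNCATABLE_PREFIXES = ("text/",)


def plan_fetch(content_type: str,
               content_length: Optional[int],
               max_doc_size: int) -> str:
    """Decide what to do with a document before reading its body.

    Returns one of:
      "fetch"    - read the body, up to max_doc_size bytes as is done today
      "truncate" - known to be too big, but a truncated copy is still useful
      "skip"     - known to be too big, and a partial file would be useless
    """
    # Drop parameters such as "; charset=iso-8859-1" from the header value.
    media_type = content_type.split(";", 1)[0].strip().lower()

    # No Content-Length header, or the document fits: nothing to decide yet.
    # An oversized document with no declared length is only discovered
    # while reading, so this sketch cannot help in that case.
    if content_length is None or content_length <= max_doc_size:
        return "fetch"

    # Too big, but text can still be indexed from its first max_doc_size bytes.
    if media_type.startswith(TRUNCATABLE_PREFIXES):
        return "truncate"

    # Too big and binary (.PPT, .PDF, ...): the converter would fail anyway.
    return "skip"
```

With max_doc_size at 100000, a 5 MB application/pdf would be skipped
without fetching the body, while a 5 MB text/html page would still be
truncated as it is now.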

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
