According to Peter Karlsson:
> > In any case, it seems it would be a good idea to make htdig remember the
> > content-type of previously indexed documents, and use it by default.
> 
> Or perhaps to assume that pages without a Content-Type are text/html?

So if you re-index, say, a PDF, and phttp assumes you don't need to
be told the Content-Type again, then you assume the PDF is HTML and
attempt to parse it as such?  Not a good idea.  I'd say, rather, that
if the Content-Type is missing, you should base your guess at the
type on either the URL suffix or what the beginning of the file
looks like.  That brings us back to the proposal of adding mime.types
and/or mime-magic processing.  This has been suggested before, as a way
to extend local_urls processing and to handle other transport types,
but nobody has stepped forward to implement it.
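
To make the fallback order concrete, here is a rough sketch in Python.
It is not htdig code.  The function name, the small signature table, the
choice to try the suffix before the file contents, and the
application/octet-stream default are all assumptions made for
illustration.  Python's standard mimetypes module stands in for a
mime.types lookup, and a few hard-coded leading byte strings stand in
for mime-magic processing.

```python
import mimetypes
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny magic table: (leading bytes, MIME type).
MAGIC_SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"%!PS", "application/postscript"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
]


def guess_content_type(url, head, header_type=None,
                       default="application/octet-stream"):
    """Pick a content type for a fetched document.

    url         -- the document's URL
    head        -- the first bytes of the document body
    header_type -- the Content-Type header value, or None if missing
    """
    # 1. Trust the server when it did send a Content-Type.
    if header_type:
        return header_type.split(";", 1)[0].strip().lower()

    # 2. Otherwise try the URL suffix (the mime.types approach).
    #    Only the path is used, so query strings do not confuse it.
    by_suffix, _encoding = mimetypes.guess_type(urlsplit(url).path)
    if by_suffix:
        return by_suffix

    # 3. Otherwise look at the start of the file (the mime-magic approach).
    for magic, mime in MAGIC_SIGNATURES:
        if head.startswith(magic):
            return mime
    if head.lstrip().lower().startswith((b"<!doctype html", b"<html")):
        return "text/html"

    # 4. Give up rather than assume HTML.
    return default
```

With this ordering, a PDF served without a Content-Type is recognised
either from a .pdf suffix or from its leading "%PDF-" bytes, and is
never handed to the HTML parser by default.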

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
