Geoff Hutchison <[EMAIL PROTECTED]> writes:
> At 9:43 PM -0500 5/29/01, Chris Green wrote:
> >Sorry, I didn't explain that well. I had to handle local files with no
> >extension.
>
> It may be worth having a default MIME type like Apache does, for files
> w/o extension. This would obviously be an attribute, say if it was
> left blank, they'd be ignored for local indexing.
Do you mean a config file directive such as default_type:
application/mail-nnml? To do this, it appears code would have to be
added to Retriever.cc and Document.cc. If I'm going to do this, I
should probably upgrade to the latest version first to see what has
changed.
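To make the fallback concrete, here is a minimal Python sketch of the
behaviour I have in mind. It is not ht://Dig code: the default_type
name and the guess_local_type helper are hypothetical, and I am
assuming that a blank default means extensionless files get skipped.

```python
import mimetypes
import posixpath


def guess_local_type(path, default_type=""):
    """Return a MIME type for a local file, or None if it should be skipped.

    default_type stands in for the hypothetical default_type attribute.
    An empty string means files with no recognised type are ignored.
    """
    _, ext = posixpath.splitext(path)
    if ext:
        # Only trust the extension table when there is an extension at all.
        guessed, _ = mimetypes.guess_type(path)
        if guessed:
            return guessed
    # No extension (or an unknown one): fall back to the configured default.
    return default_type or None
```

With the default left blank, guess_local_type("outbox/2") returns None
and the file is ignored; with default_type set to
"application/mail-nnml" the same file is handed to whatever parser is
registered for that type.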
In my quick-hack department, per recommendation, I was going to try a
start_url pointing at a page that lists a link to each of my mail
files, e.g. <a href="http://localhost/outbox/2">outbox/2</a>, but the
Retriever.cc logic bit me twice (once with a level 1 bad URL and once
with a level 2 bad URL for no extension; that's as far as I got before
I went to bed).
Thinking about it, it seems that the URL is validated once when
"collecting" URLs and once when "retrieving" them (hence level 1 and
level 2).
> >My idea, if it is sane, is to write a parser script to handle
> >text/plain in this special case ( with a dedicated htdig.cfg as I have
> >now )
>
> I'd think it'd be better to write a specific parser script for mail
> spools and giving them some sort of unique MIME type. Then you don't
> have to worry about your normal text/plain files getting mangled. I'd
> assume that others may be interested in a mailspool parser script.
That's basically what I was thinking of doing. Mainly something that
would skip the silly headers and any non-text MIME parts or uuencoded
content.
>
> (For an example, it would take Subject: lines and make them into
> header fields.)
Good idea. I'll be sure to look at what all the parsing scripts can do
before I go blindly implementing.
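As a starting point, here is a rough sketch of the extraction step in
Python, using only the standard email module. The function name, the
choice of headers to keep, and the return shape are my own
assumptions; it does not emit the output format that ht://Dig expects
from an external parser, which would still need to be layered on top.

```python
import email
import re
from email import policy

# Inline uuencoded blocks: a "begin <mode> <name>" line through the "end" line.
UU_BLOCK = re.compile(
    r"^begin [0-7]{3,4} .*?^end[ \t\r]*$\n?", re.MULTILINE | re.DOTALL
)

# Headers worth keeping as fields; everything else is dropped (assumed list).
KEEP_HEADERS = ("Subject", "From")


def extract_indexable(raw_message):
    """Return (fields, text) for one raw mail message given as bytes."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    fields = {
        name: str(msg[name]) for name in KEEP_HEADERS if msg[name] is not None
    }
    chunks = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no text of their own
        if part.get_content_type() != "text/plain":
            continue  # skip HTML, images, and other non-text parts
        if part.get_content_disposition() == "attachment":
            continue  # skip text files sent as attachments
        chunks.append(UU_BLOCK.sub("", part.get_content()))
    return fields, "\n".join(chunks)
```

Splitting a spool into individual messages is left out here; the
function only deals with one message at a time.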
--
Chris Green <[EMAIL PROTECTED]>
"I'm beginning to think that my router may be confused."