Geoff Hutchison <[EMAIL PROTECTED]> writes:

> At 9:43 PM -0500 5/29/01, Chris Green wrote:
> >Sorry, I didn't explain that well.  I had to handle local files with no
> >extension.
> 
> It may be worth having a default MIME type like Apache does, for files
> w/o extension. This would obviously be an attribute, say if it was
> left blank, they'd be ignored for local indexing.

Do you mean a config file directive such as default_type:
application/mail-nnml? To do this, it appears code would have to be
added to Retriever.cc and Document.cc. If I'm going to do this, I
should probably upgrade to the latest version first to see what
changes have occurred.
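
To make the idea concrete, here is a rough sketch of the fallback
behaviour being discussed, written in Python purely for illustration.
It is not ht://Dig code: classify(), EXTENSION_TYPES and the
default_type argument are made-up names, and the extension table is a
stand-in for whatever the real lookup would be.

```python
import os

# Hypothetical extension table; a stand-in for the real extension lookup.
EXTENSION_TYPES = {".html": "text/html", ".txt": "text/plain"}


def classify(path, default_type=""):
    """Return a MIME type for path, or None if the file should be ignored.

    Files with a known extension get their usual type. Files with no
    extension get default_type; if that is left blank they are ignored.
    Files with an unknown extension are ignored in this sketch.
    """
    ext = os.path.splitext(path)[1].lower()
    if ext in EXTENSION_TYPES:
        return EXTENSION_TYPES[ext]
    if not ext and default_type:
        return default_type
    return None


if __name__ == "__main__":
    print(classify("outbox/2", "application/mail-nnml"))
    print(classify("outbox/2"))
```

With the default set, outbox/2 would be treated as
application/mail-nnml; with it left blank, the same file would be
skipped.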

In my quick hack department, per recommendation, I was going to try a
start_url that contained a listing of URLs pointing to each of my
files, e.g. <a href="http://localhost/outbox/2">outbox/2</a>, but the
Retriever.cc logic bit me twice (once with a level 1 bad URL and once
with a level 2 bad URL for no extension; that's as far as I got
before I went to bed).
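
For reference, a listing page of that kind could be generated with
something like the following Python sketch. The function name
build_index() and the base URL are assumptions made for the example;
only the outbox/2 style of link comes from the setup described above.

```python
import html
import os
import urllib.parse


def build_index(directory, base_url):
    """Return an HTML page with one link per regular file in directory."""
    links = []
    for name in sorted(os.listdir(directory)):
        # Skip subdirectories; only plain files get a link.
        if not os.path.isfile(os.path.join(directory, name)):
            continue
        url = base_url.rstrip("/") + "/" + urllib.parse.quote(name)
        links.append('<a href="%s">%s</a><br>'
                     % (html.escape(url, quote=True), html.escape(name)))
    return "<html><body>\n" + "\n".join(links) + "\n</body></html>\n"


if __name__ == "__main__":
    # Assumed location; adjust to wherever the mail files actually live.
    print(build_index(".", "http://localhost/outbox/"))
```

The output would be saved somewhere the web server can serve it and
then named as the start_url.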

Thinking about it, it seems that the URL is validated once when
"collecting" URLs and once when "retrieving" (hence level 1 and
level 2).

> >My idea, if it is sane, is to write a parser script to handle
> >text/plain in this special case ( with a dedicated htdig.cfg as I have
> >now )
> 
> I'd think it'd be better to write a specific parser script for mail
> spools and giving them some sort of unique MIME type. Then you don't
> have to worry about your normal text/plain files getting mangled. I'd
> assume that others may be interested in a mailspool parser script.

That's basically what I was thinking of doing.  Mainly something that
would avoid silly headers and non-text MIME/uuencoded content.
> 
> (For an example, it would take Subject: lines and make them into
> header fields.)

Good idea. I'll be sure to look at what all the parsing scripts can do
before I go blindly implementing.
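
As a starting point, the filtering part might look roughly like the
Python sketch below, which uses the standard email package. It keeps
the Subject: line, drops all other headers, keeps only text/plain
parts, and removes inline uuencoded blocks. The names parse_message()
and strip_uuencoded() are invented for the example, and the output is
a plain dict; converting that into whatever record format htdig
expects from a parser script is deliberately left out.

```python
import email
import re
from email import policy

# A uuencoded block starts with "begin MODE NAME" and ends with "end".
UU_BEGIN = re.compile(r"^begin [0-7]{3,4} \S+")


def strip_uuencoded(text):
    """Drop inline uuencoded blocks from a text body."""
    kept, skipping = [], False
    for line in text.splitlines():
        if skipping:
            if line.strip() == "end":
                skipping = False
            continue
        if UU_BEGIN.match(line):
            skipping = True
            continue
        kept.append(line)
    return "\n".join(kept)


def parse_message(raw):
    """Return the subject and the indexable plain-text body of one message."""
    msg = email.message_from_string(raw, policy=policy.default)
    parts = []
    for part in msg.walk():
        # Non-text attachments and multipart containers are skipped.
        if part.get_content_type() == "text/plain":
            parts.append(strip_uuencoded(part.get_content()))
    return {"subject": str(msg.get("Subject", "")),
            "body": "\n".join(parts).strip()}
```

Keeping the subject separate from the body leaves room to emit it as
a header or title field later, as suggested above.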


-- 
Chris Green <[EMAIL PROTECTED]>
"I'm beginning to think that my router may be confused."

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
