On Tue, 27 Jul 1999, Nathaniel Irons wrote:

> On 7/27/99 at 9:05 PM, [EMAIL PROTECTED] (David Melton) wrote:
> 
> > I'm trying to get htdig to only index words in the message files, and
> > ignore the words in the index files.  In other words, I only want
> > htsearch to "hit" on message files, not the links to messages that are
> > in the index files.  
> 
> I think you want exclude_urls, not description_factor.  You specify
> space-separated strings (in mhonarc's case, the current values of the
> IDXFNAME, TIDXFNAME, IDXPREFIX, and TIDXPREFIX resources), and htdig
> ignores urls which contain any of them.

exclude_urls seems to make it entirely ignore the file. In the
case of the index files, this means that it won't traverse the
links to get to the message files.  So, if I put "threads" and
"maillist" in exclude_urls, I wind up with nothing at all!

The description of description_factor is: "Plain old 'descriptions'
are the text of a link pointing to a document. This factor gives
weight to the words of these descriptions of the document."
In the case of MHonArc index files, the link descriptions are 
the subject fields of the messages, which are exactly what I
want it to exclude from any searches.

If I can't figure out any other way, I may just create an html 
file that contains nothing but links to the msg*.html files, 
with no descriptions. That will give htdig a way to find the 
messages, but nothing to clutter up the search results.  
Really tacky, but it ought to work...  
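
A rough sketch of that links-only page, in Python. The output name
htdig_links.html and the function name build_link_page are made up
for illustration, and it assumes all of the message files sit in one
directory as msg*.html. Whether htdig follows anchors with no text
at all is also an assumption worth checking on a small archive first.

```python
import glob
import html
import os


def build_link_page(archive_dir, out_name="htdig_links.html"):
    """Write a page of bare links to every msg*.html file in archive_dir.

    Returns the sorted list of message file names that were linked.
    out_name is a hypothetical file name, not anything MHonArc or htdig
    expects.
    """
    names = sorted(
        os.path.basename(path)
        for path in glob.glob(os.path.join(archive_dir, "msg*.html"))
    )
    lines = ["<html><head><title></title></head><body>"]
    for name in names:
        # Empty anchor text, so there is no link description to index.
        lines.append('<a href="%s"></a>' % html.escape(name, quote=True))
    lines.append("</body></html>")
    with open(os.path.join(archive_dir, out_name), "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return names
```

The generated page would then be what start_url points at, with the
MHonArc index files left in exclude_urls.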

My only other option is to figure out how to add the
"<!--htdig_noindex-->" tags before and "!--\ht..." after all of
the subject text in the indices (using TTOPBEGIN, TTOPEND, etc.)
I don't really like the idea of adding all that clutter and size
to my index files.


> > There are also "previous" and "next" message links in each of the
> > message files, and those should be ignored as well.
> 
> For that, you want to add "<!--htdig_noindex-->" to your mhonarc
> resource file in the MSGBODYEND resource.  It fires at the end of the
> message body; htdig will ignore everything that follows within that
> file.

This makes sense!  It should solve half of my problem.  Thanks!
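
To preview what would be left for htdig to index once the tag is in
place, something like the following Python sketch could be run over
a generated message file. It only imitates the behavior described
above (text after an unclosed "<!--htdig_noindex-->" is dropped, and
text between it and "<!--/htdig_noindex-->" is dropped); it is not
htdig's own parser, and the function name visible_to_indexer is
invented for the example.

```python
import re

NOINDEX_START = "<!--htdig_noindex-->"
NOINDEX_END = "<!--/htdig_noindex-->"

# Match from a start tag up to the closing tag, or to the end of the
# page when the start tag is never closed.
_NOINDEX_SPAN = re.compile(
    re.escape(NOINDEX_START) + r".*?(?:" + re.escape(NOINDEX_END) + r"|\Z)",
    re.DOTALL,
)


def visible_to_indexer(page):
    """Return the page text with every noindex span removed."""
    return _NOINDEX_SPAN.sub("", page)
```

For a message file where MSGBODYEND emits the opening tag, the
previous/next links at the bottom should disappear from the result
while the message body stays.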

> I've found that mhonarc and ht://dig play quite well together.

Seems like they should. I think I'm pretty close to having it
working the way I want it.

Thanks,

  Dave
