I'm not certain I understand you.  Is your problem that you have indexed
index.html pages, and don't want them returned on a search?

If so then there are two solutions.  The simpler is to use the "exclude"
option on the search form:

<input type=hidden name=exclude value="index.html">
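As a rough sketch, that hidden field would sit inside the search form along these lines. The action path /cgi-bin/htsearch and the overall form layout are assumptions about a typical install, not taken from your setup, so adjust them to match your site:

```html
<!-- Assumed: htsearch is reachable at /cgi-bin/htsearch; change to suit your server. -->
<form method="get" action="/cgi-bin/htsearch">
  <!-- Results whose URL matches this pattern are discarded from the listing -->
  <input type="hidden" name="exclude" value="index.html">
  <!-- "words" is the field that carries the user's query -->
  <input type="text" name="words" size="30">
  <input type="submit" value="Search">
</form>
```

Because the exclusion happens at search time, nothing needs to be re-indexed when the value changes.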

The other is to add
<META name="robots" content="noindex, follow">
to the head of each file you wish to exclude and then rerun htdig & htmerge.
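For illustration only, a page carrying that tag might look like the sketch below. The title and the link are hypothetical placeholders (the file name is borrowed from the example in the original question):

```html
<html>
<head>
  <!-- Hypothetical title, for illustration -->
  <title>Document listing</title>
  <!-- Ask the indexer not to index this page, but still to follow its links -->
  <meta name="robots" content="noindex, follow">
</head>
<body>
  <!-- Documents linked from here can still be reached and indexed -->
  <a href="400cv9999.pdf">400cv9999.pdf</a>
</body>
</html>
```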

--
David Adams
Computing Services
Southampton University


----- Original Message -----
From: "Jeff Johnson" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, April 17, 2001 5:54 PM
Subject: [htdig] Can I exclude file names from search?


> I just got htdig set up and working.  We are currently saving the documents
> as the case number (ex: 400cv9999.pdf).  I have added the line
> extra_word_characters:: and modified search_algorithm: to include
> substring:1 in the htdig.conf file.  If you search for 4:00cv9999 (we don't
> use the colon in the file name), it will give you a listing of all
> documents that have that case number in them.  The problem is, if they enter a
> partial search, cv9999, it will list the documents, but it also shows
> listings for other documents.  These just point back to the directory the
> documents are saved in.  It appears that when it indexes, it is indexing the
> file name also.  Is there a way to exclude those extra listings?  Thanks.
>
> Jeff Johnson
>
>
> _______________________________________________
> htdig-general mailing list <[EMAIL PROTECTED]>
> To unsubscribe, send a message to
> <[EMAIL PROTECTED]> with a subject of unsubscribe
> FAQ: http://htdig.sourceforge.net/FAQ.html
>


