According to Bill Akins:
> If I use <!DOCTYPE HTML PUBLIC for noindex_start value and > for
> noindex_stop value, will this keep the page from being indexed while
> still following any links it may contain?

No, for two reasons.  First, noindex_start causes everything from that
start string up to the noindex_end string (the attribute is noindex_end,
not noindex_stop) to be stripped out of the document while indexing, so
the parser sees nothing in that section of the file, links included.
Second, an ending string of simply ">" ends the stripped block at the
first ">" found after the start string, not the last one, so it won't
strip out much.
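
To illustrate the second point, here is a minimal Python sketch of
"strip from the start string to the first following end string".  It is
only a model of the behaviour described above, not ht://Dig's actual
code; the function name strip_noindex and the sample page are made up
for the example, and what happens when no end string is found is an
assumption.

```python
def strip_noindex(text, start, end):
    """Drop each span running from `start` through the FIRST following `end`.

    Illustrative model only; not taken from the ht://Dig source.
    """
    kept = []
    pos = 0
    while True:
        i = text.find(start, pos)
        if i == -1:
            kept.append(text[pos:])  # no more start strings: keep the rest
            break
        kept.append(text[pos:i])     # keep everything before the start string
        j = text.find(end, i + len(start))
        if j == -1:
            break                    # assumption: unterminated block drops the rest
        pos = j + len(end)           # resume right after the first end string
    return "".join(kept)


# Hypothetical sample page: a DOCTYPE line followed by a link to a PDF.
page = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n'
        '<html><body><a href="a.pdf">a.pdf</a></body></html>')

remaining = strip_noindex(page, "<!DOCTYPE HTML PUBLIC", ">")
print(remaining)
```

With these values only the DOCTYPE declaration itself is removed; the
rest of the page, including its text and links, is still seen by the
parser.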

> I don't want any html files to be indexed since I am only serving up PDF
> files (and the HtDig web pages) but HtDig relies on Apache indexes to
> find the files.  I can not build an index for every directory since we
> add about 1,000 files/week in hundreds of dirs.
> 
> What I am trying to accomplish is not have the "Index of /somedir"
> Apache pages from showing in the search results.  I added bad_querystr:
> ?D=A ?D=D ?M=A ?M=D ?N=A ?N=D ?S=A ?S=D in the .conf but they still
> show.  I found the above string in a FAQ somewhere.

That would be http://www.htdig.org/FAQ.html#q4.23, which you should
re-read more thoroughly, as there are a number of tips in the answer
to that question which you may be able to use.

You could also use a "restrict" input parameter of ".pdf" in htsearch,
to limit search results to PDF files.  See FAQ 4.20.
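
As a rough sketch of what such a request could look like, the Python
below builds a query string carrying a "restrict" parameter.  The
/cgi-bin/htsearch path and the search words are assumptions for the
example; use whatever location and form fields your installation
actually has.

```python
from urllib.parse import urlencode

# Assumed CGI location; adjust to wherever htsearch lives on your server.
HTSEARCH_PATH = "/cgi-bin/htsearch"


def build_search_url(words, restrict=".pdf"):
    """Build an htsearch URL that limits results to URLs matching `restrict`."""
    query = urlencode({"words": words, "restrict": restrict})
    return HTSEARCH_PATH + "?" + query


url = build_search_url("spinal cord")
print(url)
```

The same effect can be had from a search form by passing "restrict" as
a hidden input field with the value ".pdf".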

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
