Let's try to keep the discussion on the list, so that I don't have to be the one designated "Answer Guy", shall we?
According to pp:

> > > > User-agent: *
> > > > Disallow: /page
> > > >
> > > > This should disallow all robots from indexing all pages within /page.
> > > > Right?
> > >
> > > Nope, you should disallow an entire directory with a slash at the
> > > end, like this: /page/
>
> Looks like my mistake :)
> The pages disallowed this way are indeed not indexed,
> but the remaining problem is that I also want to disallow
> indexing of links to these pages.
> How can I do that?

If you mean you want to exclude from the index any pages that contain
links to "/page", that's not easily done.  htdig can't do it on its own.
You'd need to use some other means to find all of these pages, and build
an exclude_urls list from that.

> I removed all the htdig databases,
> edited robots.txt like that, and the funny thing is that
> I have 6 pages indexed out of 1700.
> A bug?

Why do so many people immediately jump to the conclusion that it must be
a bug if htdig doesn't index all their files on the first try?  There are
many, many possible reasons why it might not find them all.  See
http://www.htdig.org/FAQ.html#q5.25 and the questions to which it refers.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

_______________________________________________
htdig-general mailing list
<[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]>
with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
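P.S. If anyone wants to convince themselves of the trailing-slash behaviour discussed above, Python's standard urllib.robotparser interprets Disallow rules the same prefix-matching way most crawlers do. A small sketch (the example.com URLs and paths are illustrative assumptions, not the poster's real site):

```python
# Check how robots.txt Disallow rules are interpreted, using Python's
# standard urllib.robotparser.  All URLs/paths here are made up for
# illustration.
from urllib.robotparser import RobotFileParser

def blocked(rules, url):
    """Return True if the given robots.txt lines disallow url for all agents."""
    rp = RobotFileParser()
    rp.parse(rules)
    return not rp.can_fetch("*", url)

# Without the trailing slash, "/page" is a bare prefix match and also
# catches unrelated paths such as /pagetwo.html.
loose = ["User-agent: *", "Disallow: /page"]
strict = ["User-agent: *", "Disallow: /page/"]

print(blocked(loose, "http://example.com/pagetwo.html"))   # True
print(blocked(strict, "http://example.com/pagetwo.html"))  # False
print(blocked(strict, "http://example.com/page/x.html"))   # True
```

Note that this only tells you what a well-behaved robot *should* do; it doesn't explain why already-indexed pages stay in the htdig databases until they're rebuilt.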
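P.P.S. As for finding the pages that link to "/page" in order to build an exclude_urls list: if you have filesystem access to the document tree, a crude scan will do. A sketch, with the caveat that the root path, the "/page/" prefix, and the naive href regex are all assumptions for illustration (a robust tool would parse the HTML properly rather than grep for attributes):

```python
# Build a candidate list for htdig's exclude_urls by scanning local
# HTML files for links into /page/.  Regex-based href matching is a
# deliberate simplification; it will miss single-quoted or relative links.
import os
import re

def pages_linking_to(root, prefix="/page/"):
    """Return sorted paths of .html/.htm files under root that link to prefix."""
    link_re = re.compile(r'href="%s[^"]*"' % re.escape(prefix), re.IGNORECASE)
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith((".html", ".htm")):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    if link_re.search(f.read()):
                        hits.append(path)
    return sorted(hits)
```

You'd then translate those file paths back into URLs and list them in the exclude_urls attribute of your htdig configuration.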

