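If you control the "temporary links" pages, just add a robots meta tag to them. For example, a content value of "noindex,follow" asks crawlers not to index the page but still to follow its links. See http://www.robotstxt.org/wc/meta-user.html for the available options.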
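As a rough illustration of the idea, here is a minimal sketch of such a tag and how a crawler might interpret it. This is not Nutch's own code; the names `RobotsMetaParser`, `robots_policy`, and `TEMP_PAGE`, as well as the sample page itself, are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None (e.g. <meta name>), so normalize them.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def robots_policy(html):
    """Return (index_allowed, follow_allowed) for a page.

    With no robots meta tag, both default to True.
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    index_allowed = not ({"noindex", "none"} & found)
    follow_allowed = not ({"nofollow", "none"} & found)
    return index_allowed, follow_allowed


# Hypothetical "temporary links" page: keep it out of the index,
# but let the crawler follow its links to the real pages.
TEMP_PAGE = """<html><head>
<meta name="robots" content="noindex,follow">
<title>Temporary links</title>
</head><body><a href="/real-page.html">real page</a></body></html>"""

print(robots_policy(TEMP_PAGE))  # (False, True)
```

The only part that has to go into the pages themselves is the `<meta name="robots" content="noindex,follow">` line inside `<head>`; the rest just shows the intended effect.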
Jake.

-----Original Message-----
From: Elwin [mailto:[EMAIL PROTECTED]
Sent: Friday, February 10, 2006 4:38 AM
To: [email protected]
Subject: How to control contents to be indexed?

In the process of crawling and indexing, some pages are used only as "temporary links" to the pages I want to index. How can I keep those pages from being indexed? Or which part of Nutch should I extend?
