If you control the "temporary links" pages, just add a robots meta
tag to them. See http://www.robotstxt.org/wc/meta-user.html for the
available options.
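
For the case described here, the usual choice would be
<meta name="robots" content="noindex,follow">, which asks a crawler
to leave the page out of the index while still following its links.
The sketch below is only an illustration of how such a tag is read,
assuming the crawler honors robots meta tags. The sample page, the
class RobotsMetaParser and the function robots_directives are
hypothetical names made up for this example; none of them are Nutch
code.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None for valueless attributes.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical "temporary links" page carrying the suggested tag.
TEMPORARY_LINKS_PAGE = """<html>
<head>
<title>Temporary links</title>
<meta name="robots" content="noindex,follow">
</head>
<body><a href="/real-page.html">real page</a></body>
</html>"""

directives = robots_directives(TEMPORARY_LINKS_PAGE)
# "none" is shorthand for "noindex,nofollow".
should_index = "noindex" not in directives and "none" not in directives
should_follow = "nofollow" not in directives and "none" not in directives
print(sorted(directives), should_index, should_follow)
```

With this tag the page itself is skipped at indexing time, but the
links on it are still followed, so the pages it points to can be
fetched and indexed as normal.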

Jake.

-----Original Message-----
From: Elwin [mailto:[EMAIL PROTECTED] 
Sent: Friday, February 10, 2006 4:38 AM
To: [email protected]
Subject: How to control contents to be indexed?

While crawling and indexing, some pages serve only as "temporary
links" to the pages I actually want to index. How can I keep those
pages from being indexed? Or which part of Nutch should I extend?


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
