Elwin Fri, 10 Feb 2006 01:39:09 -0800
While crawling and indexing, some pages serve only as "temporary links" to the pages I actually want to index. How can I keep those intermediate pages from being indexed? Or which part of Nutch should I extend?