config help

KRIS MUSSHORN Mon, 12 Dec 2016 11:55:08 -0800

I'm using Nutch 1.12 and Solr 5.4.1.  
   
Crawling a website with Nutch and indexing into Solr.  
  
AFAIK, a URL excluded in the regex-urlfilter.txt file will not be crawled at all.  
   
What if I have  
https://XXXX/inside/default.cfm  as my seed URL?  
I want the links on this page to be crawled and indexed, but I do not want this 
page itself to be indexed into Solr.  
How would I set this up?  
   
I'm thinking that the regex-urlfilter.txt file is NOT the right place.
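
To illustrate the concern, here is a minimal sketch of what such a rule might 
look like in regex-urlfilter.txt. The exclusion line is a hypothetical example, 
not a tested configuration, and XXXX is the elided host from above. Since the 
first matching pattern decides whether a URL is kept, a rule like this would 
drop the seed page itself, so its links would never be fetched or followed:

```
# Hypothetical rule: reject the seed page (XXXX stands for the elided host)
-^https://XXXX/inside/default\.cfm$

# Accept everything else
+.
```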
