RE: Crawl and Index specific links on specific page

anish_88 Fri, 13 Dec 2013 04:24:12 -0800

OK, let me put it this way.

I have a urls folder in which there is a nutch.txt file containing the URL
http://nutch.apache.org/


Now in regex-urlfilter.txt I have this entry:
+^http://([a-z0-9]*\.)*nutch.apache.org/downloads.html

Doesn't the crawler go to nutch.apache.org/downloads.html and crawl all the
links on that page if I invoke this command?

./nutch crawl urls -dir newcrawler -depth 3
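
To see which URLs the rule above would let through, here is a rough sketch
in Python, run outside Nutch. It assumes the rule is applied as a search
against the full URL string and that it is the only accept rule in the file
(no trailing "+." line). The accepted() helper and the candidate URLs are
made up for illustration and are not part of Nutch.

```python
import re

# The entry from regex-urlfilter.txt with the leading "+" (accept) removed.
# Assumption: the pattern is searched for in the URL, not matched in full.
RULE = re.compile(r"^http://([a-z0-9]*\.)*nutch.apache.org/downloads.html")


def accepted(url):
    """Return True if the URL matches the accept rule (hypothetical helper)."""
    return RULE.search(url) is not None


if __name__ == "__main__":
    # Illustrative candidates: the seed from nutch.txt, the target page,
    # and two other pages a crawl might encounter.
    candidates = [
        "http://nutch.apache.org/",
        "http://nutch.apache.org/downloads.html",
        "http://nutch.apache.org/about.html",
        "http://www.apache.org/dyn/closer.cgi/nutch/",
    ]
    for url in candidates:
        verdict = "accepted" if accepted(url) else "rejected"
        print(verdict, url)
```

Under those assumptions only the downloads.html URL is accepted. The seed
http://nutch.apache.org/ does not match, and neither do links that lead away
from downloads.html, so the rule as written may be narrower than intended.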




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Crawl-and-Index-specific-links-on-specific-page-tp4106524p4106584.html
Sent from the Nutch - User mailing list archive at Nabble.com.
