bypass crawl-urlfilter.txt

shantanu Thu, 02 Jun 2011 13:48:34 -0700

Hi, I am trying to use Nutch to crawl websites one at a time. However, I do
not want to use crawl-urlfilter.txt for filtering URLs. Instead, I want to
do the filtering with some class from Nutch, but I am not sure which one. Can
somebody guide me on this?


Example: I would tell it to crawl http://www.amazon.com, and it should not
consult crawl-urlfilter.txt but instead apply the filtering from within the
Java program itself.
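
To make the request concrete, here is a rough sketch of the kind of in-program
filtering meant above. It is plain Java and does not use any Nutch class; the
name HostUrlFilter and the "return the URL to accept, null to reject"
convention are assumptions made for illustration only. It keeps URLs on the
same site as the seed and rejects everything else.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, not part of Nutch: accepts only URLs on the seed's site.
public class HostUrlFilter {
    private final String allowedHost;

    public HostUrlFilter(String seedUrl) {
        // URI.create throws IllegalArgumentException on a malformed seed URL.
        this.allowedHost = normalize(URI.create(seedUrl).getHost());
    }

    // Lower-case the host and drop a leading "www." so both forms match.
    private static String normalize(String host) {
        if (host == null) {
            return null;
        }
        String h = host.toLowerCase(Locale.ROOT);
        return h.startsWith("www.") ? h.substring(4) : h;
    }

    // Returns the URL unchanged if it is accepted, or null if it is rejected.
    public String filter(String url) {
        try {
            String host = normalize(URI.create(url).getHost());
            if (host == null || allowedHost == null) {
                return null;
            }
            boolean sameSite = host.equals(allowedHost)
                    || host.endsWith("." + allowedHost);
            return sameSite ? url : null;
        } catch (IllegalArgumentException e) {
            return null; // malformed URLs are rejected
        }
    }

    public static void main(String[] args) {
        HostUrlFilter f = new HostUrlFilter("http://www.amazon.com");
        System.out.println(f.filter("http://www.amazon.com/books"));
        System.out.println(f.filter("http://example.org/"));
    }
}
```

What I am looking for is the Nutch equivalent of this: a class I can call or
configure from my own program so that the accept/reject decision is made in
code for each seed site, rather than read from crawl-urlfilter.txt.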

--
View this message in context: 
http://lucene.472066.n3.nabble.com/bypass-crawl-urlfilter-txt-tp3017143p3017143.html
Sent from the Nutch - User mailing list archive at Nabble.com.
