URL filter help

ajaxtrend Mon, 17 Dec 2007 08:54:43 -0800

Hello Group,

I need to index URLs that match a particular URL pattern, and I have added 
the pattern to crawl-urlfilter.txt. For example, I want to index all URLs of 
www.text.com that are under the products directory, so my regex is:
   
  +^http://www.text.com/products/.*
   
  urls/my.txt contains the following entry:
   
  http://www.text.com
   
  That means I want to start indexing from the main page of www.text.com. 
However, Nutch does not index anything, and when I run it, it says:
   
  No URLs to fetch - check your seed list and URL filters.
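
  As a sanity check outside Nutch, the rule can be tested against the seed 
URL directly. This is only a minimal Java sketch, not Nutch code: the class 
name SeedFilterCheck and the method accepted are made up for illustration, 
and it assumes that a URL matching no '+' rule is dropped and that rules are 
applied as a partial (find-style) regex match.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Nutch.
class SeedFilterCheck {
    // The single accept rule from crawl-urlfilter.txt, without the leading '+'.
    // The dots are left unescaped, exactly as in the filter file.
    static final Pattern ACCEPT =
        Pattern.compile("^http://www.text.com/products/.*");

    // Assumption: a URL that matches no accept rule is rejected.
    static boolean accepted(String url) {
        return ACCEPT.matcher(url).find();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.text.com",                     // the entry in urls/my.txt
            "http://www.text.com/products/index.html"  // a page under products
        };
        for (String url : urls) {
            System.out.println(url + " -> " + (accepted(url) ? "accepted" : "rejected"));
        }
    }
}
```

  Under those assumptions the seed URL itself does not match the rule, while 
a URL under /products/ does, which would be consistent with the message 
above.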

  I am sure this must have been answered before. I have already searched the 
archive but was not able to find any suggestions. 
  I would really appreciate any suggestions, or pointers to the classes I 
should look into.
   
  Thanks in advance.
   
  - BR


       