Hi,

How or where can I define which URLs are accepted while crawling with Nutch? I want to index only the pages whose links have a certain format, e.g. http://www.myCompany.com/myServlet?

At the moment the crawl picks up all the links under my company's host, but I need finer filtering. This is the rule I have now:

# accept hosts in MY.DOMAIN.NAME
+^http://([a-z0-9]*\.)*myCompany.com/

What I want is to index all pages whose link starts with "http://www.myCompany.com/myServlet?".

Thanks for any ideas.

Regards,
Cem
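
A minimal sketch of a rule set that might achieve this. It assumes the rules live in the same filter file as the one quoted above (conf/crawl-urlfilter.txt for the one-step crawl command, or conf/regex-urlfilter.txt otherwise; which one applies is an assumption about the setup), and that rules are read top to bottom with the first matching rule deciding:

```text
# The stock filter file skips URLs containing query characters. That rule
# would reject ".../myServlet?..." links, so it is commented out here.
# -[?*!@=]

# accept only URLs that start with the servlet prefix
# (the dots and the "?" are escaped so they match literally)
+^http://www\.myCompany\.com/myServlet\?

# skip everything else
-.
```

One caveat with this approach: the same filter decides which URLs are fetched, not only which are indexed. If the servlet links can only be discovered through other pages on the host, those pages would be filtered out too, so the seed list would probably need to contain URLs that already match the servlet prefix.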
