Hi,

I am trying to configure Nutch intranet crawling to index multiple
sites. I need help with how to write the entries in the "urls" flat
file and in the crawl-urlfilter.txt file.

For example, I want to crawl the following two URLs:

1. http://lucene.apache.org/nutch/
2. http://sourceforge.net/

Can I list them one after the other, on two separate lines, in the "urls" flat file?

And in crawl-urlfilter.txt, can I have entries like these:

+^http://([a-z0-9]*\.)*apache.org/

+^http://([a-z0-9]*\.)*sourceforge.net/
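
One way to sanity-check entries like these before running a crawl is a small Python sketch such as the one below. It assumes the filter behaves as a first-match-wins list of regular expressions prefixed with + or -, that a URL matching no rule is rejected, and that Python's re.search is close enough to the Java regex matching the crawler performs. The trailing "-." catch-all line and the helper names parse_rules and accepts are illustrative additions, not part of the configuration above.

```python
import re

# The two filter lines from the message, followed by a catch-all reject
# rule. The "-." line is an assumption added for illustration.
FILTER_LINES = [
    r"+^http://([a-z0-9]*\.)*apache.org/",
    r"+^http://([a-z0-9]*\.)*sourceforge.net/",
    r"-.",
]


def parse_rules(lines):
    """Turn '+regex' / '-regex' lines into (include, compiled_regex) pairs."""
    rules = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with + or -: %r" % line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Assumed semantics: the first matching rule decides; no match rejects."""
    for include, regex in rules:
        if regex.search(url):
            return include
    return False


RULES = parse_rules(FILTER_LINES)

# The two seed URLs, one per entry, as they would appear in the "urls" file.
SEEDS = [
    "http://lucene.apache.org/nutch/",
    "http://sourceforge.net/",
]

for url in SEEDS:
    print(url, "accepted" if accepts(RULES, url) else "rejected")
```

Under these assumptions both seed URLs pass the filter, while a URL on an unrelated host falls through to the catch-all rule and is rejected. Note that the dots in apache.org and sourceforge.net are unescaped, so each matches any single character; writing apache\.org and sourceforge\.net would be stricter.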

 

 

Can someone help me?

Thanks,
Madhu


