Hi,

I am trying to follow the instructions from
http://lucene.apache.org/nutch/tutorial8.html
.....
Intranet: Configuration
To configure things for intranet crawling you must:

1. Create a directory with a flat file of root urls.  For example, to
   crawl the nutch site you might start with a file named urls/nutch
   containing the url of just the Nutch home page.  All other Nutch
   pages should be reachable from this page.  The urls/nutch file would
   thus contain:

   http://lucene.apache.org/nutch/

....
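
As far as I can tell, the quoted step only asks for a directory named urls holding a plain text file named nutch with one URL in it. A minimal sketch of that reading follows. Python is used only for illustration, the helper name write_seed_file is hypothetical, and the tutorial does not prescribe any particular tool for creating the file.

```python
from pathlib import Path


def write_seed_file(base_dir: str = ".") -> Path:
    """Create urls/nutch under base_dir, containing the single root URL.

    The directory name, file name, and URL are taken from the quoted
    tutorial text; the function itself is only an illustration.
    """
    seed = Path(base_dir) / "urls" / "nutch"
    # Create the urls/ directory if it does not exist yet.
    seed.parent.mkdir(parents=True, exist_ok=True)
    # One root URL per line; here there is only the Nutch home page.
    seed.write_text("http://lucene.apache.org/nutch/\n")
    return seed


if __name__ == "__main__":
    path = write_seed_file()
    print(path.read_text(), end="")
```

Creating the same file by hand in a text editor should be equivalent, assuming this reading of the step is right.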

I do not understand this. Can anyone help me out?

Thanks.
zhou

