Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by susam:
http://wiki.apache.org/nutch/Crawl

The comment on the change is:
fixed typos

------------------------------------------------------------------------------
  == Introduction ==
- This is a script to crawl an Internet or the web. It does not crawl using the 
'bin/crawl' tool or 'Crawl' class present in Nutch, therefore the filters 
present in 'conf/crawl-urlfilter.txt ' has not effect on this script. The 
filters for this script must be set in 'regex-urlfilter.txt'.
+ This is a script to crawl an Intranet as well as the web. It does not crawl 
using the 'bin/crawl' tool or 'Crawl' class present in Nutch. Therefore, the 
filters present in 'conf/crawl-urlfilter.txt' have no effect on this script. 
The filters for this script must be set in 'regex-urlfilter.txt'.
  
  == Steps ==
  The complete job of this script has been divided broadly into 8 steps.
