Update of /cvsroot/nutch/nutch/conf
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv15519/conf

Modified Files:
        crawl-tool.xml nutch-default.xml 
Log Message:
Added a plugin.includes config parameter that determines which plugins
are included.  By default, only the HTTP protocol, HTML and plain-text
parsing, and basic indexing and search plugins are now enabled, rather
than all plugins.  This should make default performance more
predictable and reliable going forward.


Index: nutch-default.xml
===================================================================
RCS file: /cvsroot/nutch/nutch/conf/nutch-default.xml,v
retrieving revision 1.60
retrieving revision 1.61
diff -C2 -d -r1.60 -r1.61
*** nutch-default.xml   30 Nov 2004 06:36:00 -0000      1.60
--- nutch-default.xml   9 Dec 2004 17:40:45 -0000       1.61
***************
*** 262,266 ****
  <property>
    <name>fetcher.server.delay</name>
!   <value>5</value>
    <description>The number of seconds the fetcher will delay between 
     successive requests to the same server.</description>
--- 262,266 ----
  <property>
    <name>fetcher.server.delay</name>
!   <value>5.0</value>
    <description>The number of seconds the fetcher will delay between 
     successive requests to the same server.</description>
***************
*** 496,502 ****
  
  <property>
    <name>plugin.excludes</name>
    <value></value>
!   <description>Regular expression naming plugin directory names to exclude.
    </description>
  </property>
--- 496,512 ----
  
  <property>
+   <name>plugin.includes</name>
+   <value>protocol-http|parse-(text|html)|index-basic|query-(basic|site|url)</value>
+   <description>Regular expression naming plugin directory names to
+   include.  Any plugin not matching this expression is excluded.  By
+   default Nutch includes crawling just HTML and plain text via HTTP,
+   and basic indexing and search plugins.
+   </description>
+ </property>
+ 
+ <property>
    <name>plugin.excludes</name>
    <value></value>
!   <description>Regular expression naming plugin directory names to exclude.  
    </description>
  </property>

Index: crawl-tool.xml
===================================================================
RCS file: /cvsroot/nutch/nutch/conf/crawl-tool.xml,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** crawl-tool.xml      16 Jun 2004 17:31:30 -0000      1.3
--- crawl-tool.xml      9 Dec 2004 17:40:44 -0000       1.4
***************
*** 34,38 ****
  <property>
    <name>fetcher.server.delay</name>
!   <value>1</value>
    <description>The number of seconds the fetcher will delay between 
     successive requests to the same server.</description>
--- 34,38 ----
  <property>
    <name>fetcher.server.delay</name>
!   <value>1.0</value>
    <description>The number of seconds the fetcher will delay between 
     successive requests to the same server.</description>
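
Since plugin.includes now limits the enabled plugins, a site that needs
more than the defaults has to widen the expression.  A minimal sketch,
assuming overrides live in conf/nutch-site.xml under a <nutch-conf> root
element and that a parse-pdf plugin directory is installed (both are
assumptions; neither appears in this commit):

```xml
<?xml version="1.0"?>
<!-- Assumed location: conf/nutch-site.xml (not part of this commit). -->
<nutch-conf>
  <property>
    <name>plugin.includes</name>
    <!-- Default value from nutch-default.xml r1.61, with a hypothetical
         extra alternative "pdf" added to the parse-(...) group. -->
    <value>protocol-http|parse-(text|html|pdf)|index-basic|query-(basic|site|url)</value>
    <description>Site override of the default plugin include list.</description>
  </property>
</nutch-conf>
```

Because any plugin not matching the expression is excluded, an override
has to repeat the defaults it wants to keep.  Whether the expression must
match the whole directory name or only part of it is not stated in this
commit, so spelling out full directory names is the safer choice.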



_______________________________________________
Nutch-cvs mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-cvs
