Author: dogacan
Date: Thu Oct  2 02:17:23 2008
New Revision: 701052

URL: http://svn.apache.org/viewvc?rev=701052&view=rev
Log:
NUTCH-640 - confusing description "set it to Integer.MAX_VALUE"

Modified:
    lucene/nutch/trunk/CHANGES.txt
    lucene/nutch/trunk/conf/nutch-default.xml
    lucene/nutch/trunk/src/java/org/apache/nutch/indexer/Indexer.java

Modified: lucene/nutch/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/nutch/trunk/CHANGES.txt?rev=701052&r1=701051&r2=701052&view=diff
==============================================================================
--- lucene/nutch/trunk/CHANGES.txt (original)
+++ lucene/nutch/trunk/CHANGES.txt Thu Oct  2 02:17:23 2008
@@ -281,6 +281,9 @@
 103. NUTCH-654 - urlfilter-regex's main does not work.
      (dogacan)
 
+104. NUTCH-640 - confusing description "set it to Integer.MAX_VALUE".
+     (dogacan)
+
 Release 0.9 - 2007-04-02
 
  1. Changed log4j confiquration to log to stdout on commandline

Modified: lucene/nutch/trunk/conf/nutch-default.xml
URL: http://svn.apache.org/viewvc/lucene/nutch/trunk/conf/nutch-default.xml?rev=701052&r1=701051&r2=701052&view=diff
==============================================================================
--- lucene/nutch/trunk/conf/nutch-default.xml (original)
+++ lucene/nutch/trunk/conf/nutch-default.xml Thu Oct  2 02:17:23 2008
@@ -634,8 +634,8 @@
   from the index tokens that occur further in the document. If you
   know your source documents are large, be sure to set this value
   high enough to accomodate the expected size. If you set it to
-  Integer.MAX_VALUE, then the only limit is your memory, but you
-  should anticipate an OutOfMemoryError.
+  -1, then the only limit is your memory, but you should anticipate
+  an OutOfMemoryError.
   </description>
 </property>
 

Modified: lucene/nutch/trunk/src/java/org/apache/nutch/indexer/Indexer.java
URL: http://svn.apache.org/viewvc/lucene/nutch/trunk/src/java/org/apache/nutch/indexer/Indexer.java?rev=701052&r1=701051&r2=701052&view=diff
==============================================================================
--- lucene/nutch/trunk/src/java/org/apache/nutch/indexer/Indexer.java (original)
+++ lucene/nutch/trunk/src/java/org/apache/nutch/indexer/Indexer.java Thu Oct  2 02:17:23 2008
@@ -90,6 +90,9 @@
       final Path temp =
         job.getLocalPath("index/_"+Integer.toString(new Random().nextInt()));
 
+      int maxTokens = job.getInt("indexer.max.tokens", 10000);
+      if (maxTokens < 0) maxTokens = Integer.MAX_VALUE;
+
       fs.delete(perm, true);                            // delete old, if any
 
       final AnalyzerFactory factory = new AnalyzerFactory(job);
@@ -102,7 +105,7 @@
       writer.setMaxMergeDocs(job.getInt("indexer.maxMergeDocs", Integer.MAX_VALUE));
       writer.setTermIndexInterval
         (job.getInt("indexer.termIndexInterval", 128));
-      writer.setMaxFieldLength(job.getInt("indexer.max.tokens", 10000));
+      writer.setMaxFieldLength(maxTokens);
       writer.setInfoStream(LogUtil.getInfoStream(LOG));
       writer.setUseCompoundFile(false);
       writer.setSimilarity(new NutchSimilarity());
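
For readers skimming the Indexer.java hunk above, the behavior change can be sketched in isolation: a negative indexer.max.tokens is now treated as "no limit" and mapped to Integer.MAX_VALUE before it is passed to setMaxFieldLength. The class MaxTokensSketch and the helper resolveMaxTokens are hypothetical names used only for illustration, and java.util.Properties stands in for the job configuration; none of this is part of the commit.

```java
import java.util.Properties;

public class MaxTokensSketch {

    // Hypothetical helper, not part of Nutch. It mirrors the two lines
    // added to Indexer.java: read "indexer.max.tokens" (default 10000)
    // and treat any negative value as "unlimited".
    public static int resolveMaxTokens(Properties conf) {
        int maxTokens =
            Integer.parseInt(conf.getProperty("indexer.max.tokens", "10000"));
        if (maxTokens < 0) maxTokens = Integer.MAX_VALUE;
        return maxTokens;
    }

    public static void main(String[] args) {
        // Properties is an assumption here; the real code reads from the job.
        Properties conf = new Properties();
        System.out.println(resolveMaxTokens(conf));   // prints 10000

        conf.setProperty("indexer.max.tokens", "-1");
        System.out.println(resolveMaxTokens(conf));   // prints 2147483647
    }
}
```

With this mapping, the description in nutch-default.xml can tell users to set -1 rather than asking them to write the numeric value of Integer.MAX_VALUE into an XML file.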

