Hi again.  If any of you have been wondering why the minimum_word_length
attribute seems to have no effect on words indexed in plain text files,
here's the fix:

--- htdig/Plaintext.cc.nominword        Tue Feb 16 23:03:53 1999
+++ htdig/Plaintext.cc  Fri Mar 19 10:39:54 1999
@@ -67,6 +67,7 @@ Plaintext::parse(Retriever &retriever, U
 
     unsigned char       *position = (unsigned char *) contents->get();
     unsigned char      *start = position;
+    static int minimumWordLength = config.Value("minimum_word_length", 3);
     int                offset = 0;
     int                in_space = 0;
     String     word;
@@ -94,11 +95,11 @@ Plaintext::parse(Retriever &retriever, U
                head << word;
            }
 
-           if (word.length() > 2)
+           if (word.length() >= minimumWordLength)
            {
                word.lowercase();
                word.remove(valid_punctuation);
-               if (word.length() > 2)
+               if (word.length() >= minimumWordLength)
                {
                    retriever.got_word(word,
                                       int(offset * 1000 / contents->length()),

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED] containing the single word "unsubscribe" in
the SUBJECT of the message.

Reply via email to