An update to my recent patch for setupWords() in htsearch.cc
because I identified some more goo.

It seems it is always wrong to remove words in "boolean" search,
as this will leave a dangling "or", "and" or "not" operator:
if you badword "cat", your "boolean" search for "cat or dog"
will just say "or dog" in $(WORDS). 
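
To make the effect concrete, here is a small stand-alone sketch of
the two filtering strategies.  It is only a simplified model, not the
htsearch code: BadWords, isOperator(), filterOld(), filterNew() and
checkExample() are hypothetical names, the bad-word list is a plain
std::set instead of the real class, and operators are matched as
whole lower-case words.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical stand-in for the bad-word list; not the real htsearch class.
using BadWords = std::set<std::string>;

// Simplified operator test: whole lower-case words only (an assumption).
static bool isOperator(const std::string &w)
{
    return w == "or" || w == "and" || w == "not";
}

// Model of the behaviour described above: bad words are dropped,
// but boolean operators are kept, so an operator can be left dangling.
std::string filterOld(const std::string &query, const BadWords &bad, bool boolean)
{
    std::istringstream in(query);
    std::string word, out;
    while (in >> word) {
        if (boolean && isOperator(word))
            out += word + ' ';
        else if (bad.count(word) == 0)
            out += word + ' ';
    }
    return out;
}

// Model of the patched behaviour: in "boolean" mode every word is kept;
// bad words are only filtered out in the other search modes.
std::string filterNew(const std::string &query, const BadWords &bad, bool boolean)
{
    std::istringstream in(query);
    std::string word, out;
    while (in >> word) {
        if (boolean)
            out += word + ' ';
        else if (bad.count(word) == 0)
            out += word + ' ';
    }
    return out;
}

// Small self-check of the "cat or dog" case from the text.
void checkExample()
{
    const BadWords bad = {"cat"};
    assert(filterOld("cat or dog", bad, true) == "or dog ");      // dangling "or"
    assert(filterNew("cat or dog", bad, true) == "cat or dog ");  // query intact
    assert(filterNew("cat dog", bad, false) == "dog ");           // still filtered
}
```

Calling checkExample() from a main() exercises the "cat or dog" case;
the trailing space mirrors the way each word is appended followed by
' ' in the patch.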

This may still be incomplete; I would rather remove this
"filtering" of what to keep in $(WORDS) entirely; it only
half-heartedly removes badworded words and tries to skip the
"hidden" on-the-fly modifiers (those the user wrote inline in
the query such as "hidden:" and "exact:", see the
code and Mr Scherpbier's recent mail with the message-id
<[EMAIL PROTECTED]>, not in the archive yet).

This is done for no good reason IMHO -- I think $(WORDS) should be
kept exactly as the user wrote it, for only the *user* to
modify.

But that would be a change in function more than a fix for an
abnormal situation, so I will not make a patch for it until I
know if that's acceptable.  (So? ;-)

This patch is a *replacement* for my recent patch (it was
easiest for me this way, as that one wasn't in CVS yet. :-)


By the way, is this address ([EMAIL PROTECTED]) really appropriate
for patches?  <URL:http://dev.htdig.org/patches.html> says they
should go here ("the htdig mailing list"), but I think
htdig3-dev would be better.   Thoughts?


Sun Jan 11 02:42:51 1999  Hans-Peter Nilsson  <[EMAIL PROTECTED]>

        * htsearch/htsearch.cc (setupWords): Do not skip words
        if "boolean" search.

*** /tmp/htsearch.cc.orig       Sat Dec 19 17:55:11 1998
--- ./htsearch.cc       Tue Jan 12 02:13:18 1999
*************** setupWords(char *allWords, List &searchW
*** 417,427 ****
            i++;
            continue;
        }
!       if (badWords.IsValid(p))
            parsedWords << p << ' ';
!       if (boolean && ((mystrncasecmp(p, "or", 2) == 0) || 
!                       (mystrncasecmp(p, "and", 3) == 0) ||
!                       (mystrncasecmp(p, "not", 3) == 0)))
            parsedWords << p << ' ';
      }
  
--- 450,458 ----
            i++;
            continue;
        }
!       if (boolean)
            parsedWords << p << ' ';
!       else if (badWords.IsValid(p))
            parsedWords << p << ' ';
      }
  
brgds, H-P