An update to my recent patch for setupWords() in htsearch.cc
because I identified some more goo.
It seems it is always wrong to remove words in a "boolean" search,
as this will leave a dangling "or", "and" or "not" operator:
if you badword "cat", your "boolean" search for "cat or dog"
will just say "or dog" in $(WORDS).
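To make the dangling-operator problem concrete, here is a minimal sketch of this kind of filtering. It is not the htsearch code: filterWords() and the std::set used as the bad word list are hypothetical stand-ins, assumed only for illustration.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical stand-in for the filtering in setupWords(): drop every
// word found in badWords and keep the rest, each followed by a space.
std::string filterWords(const std::string &query,
                        const std::set<std::string> &badWords)
{
    std::istringstream in(query);
    std::string word, kept;
    while (in >> word)
        if (badWords.count(word) == 0)   // word is not badworded
            kept += word + ' ';
    return kept;
}
```

With "cat" in the bad word list, filterWords("cat or dog", ...) keeps only "or" and "dog", so the operator is left without its left operand.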
This may still be incomplete; I would rather remove this
"filtering" of what to keep in $(WORDS) entirely; it only
half-heartedly removes badworded words and tries to skip the
"hidden" on-the-fly modifiers (those the user wrote inline in
the query such as "hidden:" and "exact:", see the
code and Mr Scherpbier's recent mail with the message-id
<[EMAIL PROTECTED]>, not in the archive yet).
This is done for no good reason IMHO -- I think $(WORDS) should be
kept unmodified as the user wrote it; only for the *user* to
modify.
But that would be a change in function more than a fix for an
abnormal situation, so I will not make a patch for it until I
know if that's acceptable. (So? ;-)
This patch is a *replacement* for my recent patch (it was
easiest for me this way, as that one wasn't in CVS yet. :-)
By the way, is this address ([EMAIL PROTECTED]) really appropriate
for patches? <URL:http://dev.htdig.org/patches.html> says they
should go here ("the htdig mailing list"), but I think
htdig3-dev would be better. Thoughts?
Sun Jan 11 02:42:51 1999 Hans-Peter Nilsson <[EMAIL PROTECTED]>
* htsearch/htsearch.cc (setupWords): Do not skip words
if "boolean" search.
*** /tmp/htsearch.cc.orig Sat Dec 19 17:55:11 1998
--- ./htsearch.cc Tue Jan 12 02:13:18 1999
*************** setupWords(char *allWords, List &searchW
*** 417,427 ****
i++;
continue;
}
! if (badWords.IsValid(p))
parsedWords << p << ' ';
! if (boolean && ((mystrncasecmp(p, "or", 2) == 0) ||
! (mystrncasecmp(p, "and", 3) == 0) ||
! (mystrncasecmp(p, "not", 3) == 0)))
parsedWords << p << ' ';
}
--- 450,458 ----
i++;
continue;
}
! if (boolean)
parsedWords << p << ' ';
! else if (badWords.IsValid(p))
parsedWords << p << ' ';
}
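For reference, the effect of the new branch can be modelled on its own roughly like this. It is a simplified sketch, not the real function: buildParsedWords() is a hypothetical name, the word list is assumed to be already split, and a std::set lookup stands in for badWords.IsValid(p).

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model of the patched branch in setupWords().
std::string buildParsedWords(const std::vector<std::string> &words,
                             const std::set<std::string> &badWords,
                             bool boolean)
{
    std::string parsedWords;
    for (const std::string &p : words) {
        if (boolean)
            parsedWords += p + ' ';        // "boolean" search: keep every word
        else if (badWords.count(p) == 0)   // stands in for badWords.IsValid(p)
            parsedWords += p + ' ';        // otherwise: keep only valid words
    }
    return parsedWords;
}
```

In this model, a "boolean" search for "cat or dog" with "cat" badworded keeps all three words, while a non-boolean search still drops "cat".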
brgds, H-P