On Tue, 22 Jan 2008 16:45:36 -0800, Scott Hess wrote
> MATCH "foo -bar" should return "The set of documents which match foo
> but not bar". I read MATCH "-bar" as "The set of all documents which
> do not match bar". MATCH "-foo -bar" would be "The set of all
> documents which do not match foo and do not match bar".

Each match word adds a constraint which filters down the results. Each word in the match query disqualifies all documents that have or don't have that word, depending on the presence or absence of a leading -, respectively. Is that the right way to look at things? This seems consistent with SQL's design of adding constraints to queries which default to yielding all rows, and it also seems to match up with your above explanation. But it doesn't explain the empty match query case, in which no constraints are given.

Previously I had thought that the result set is initially empty, that words lacking -'s add to the set, and words with -'s remove from the set. This is consistent with the fact that an empty match query returns zero results, and this reasoning predicts that a match query consisting only of -words will also give zero results. But this isn't how you explain things.

> I'm not sure how the empty-string results matter, as I don't consider
> MATCH "-foo" to have an implicit empty term.

Okay, I agree that it's logical for a match query consisting only of negated words to return all rows lacking those words, and I am fine with this particular case being unsupported. (There are other "missing" features in SQLite that are more important to me, like recursive triggers and foreign key constraints, so I don't mind waiting on this one.)

However, it's also logical (I think--- show me where I'm wrong) for an empty match query to return all rows, which is an unsupported operation. Yet rather than fail with an SQL logic error, fts3 yields zero rows in this case. I find this to be inconsistent, and I'd rather have both throw errors or both return zero results.

> That's what I'm saying. Calculating the set of all documents which
> match "foo" and the set of all documents which match "bar" and
> removing the latter from the former is conveniently available from
> the fts index.
> Calculating the set of all documents in the fts
> index would require running a separate query to figure it out.

I take from your discussion that the fts index keeps track of all rows that contain a given word. Is it reasonable to add one more entry to the index that lists all rows? The indexed "word" could even be the empty string, as in all rows contain the empty string. :^) Then in any match query lacking nonnegated words (i.e. an empty match query or an entirely negative match query), the match words' index sets are intersected with or subtracted from this index of everything, as if the match query indeed has an implicit empty term.

-- 
Andy Goth | <[EMAIL PROTECTED]> | http://andy.junkdrome.org/
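
P.S. To make the implicit-empty-term idea concrete, here is a rough sketch in Python. It is only a toy model of what I'm proposing, not how fts3 actually stores or queries its index. The names (EVERYTHING, build_index, match) are made up for illustration, and splitting on whitespace stands in for a real tokenizer.

```python
from collections import defaultdict

# Hypothetical index key: the "empty word" that every row is assumed to contain.
EVERYTHING = ""


def build_index(docs):
    """Map each word to the set of row ids containing it, plus one entry listing all rows."""
    index = defaultdict(set)
    for rowid, text in docs.items():
        index[EVERYTHING].add(rowid)
        for word in text.lower().split():
            index[word].add(rowid)
    return index


def match(index, query):
    """Evaluate a query made of plain words and -negated words."""
    # Start from the implicit empty term, so a query with no
    # nonnegated words still has a well-defined starting set.
    result = set(index.get(EVERYTHING, set()))
    for term in query.lower().split():
        if term.startswith("-"):
            # Negated word: subtract the rows that contain it.
            result -= index.get(term[1:], set())
        else:
            # Plain word: keep only the rows that contain it.
            result &= index.get(term, set())
    return result


docs = {1: "foo bar", 2: "foo baz", 3: "qux"}
index = build_index(docs)
print(match(index, "foo -bar"))   # rows with foo but not bar
print(match(index, "-foo -bar"))  # rows with neither word
print(match(index, ""))           # empty query: every row
```

Under this model the empty query and the all-negative query are both handled the same way as any other query, at the cost of one extra index entry that grows with the table.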