Lucene uses a scoring system that behaves similarly to a boolean system. Each piece of the query contributes to the score for each document. If a document scores 0, it is not returned in the results.

 To search for documents that must contain "apples" and may contain
"oranges" use the query:  +apples oranges
This query will score any document without apples as 0. If a doc contains apples it will get a positive score, and if it also happens to contain oranges it will score higher. The absence of oranges will not force a 0 score for a document, but its presence will boost the score.

Clearly this is _not_ the same as 'apples OR oranges', which would
That would be: apples oranges
In this case, an absence of either term will not force a 0 score, but if no terms appear the score will be 0. Both terms appearing would score higher than just one.
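
A toy sketch of the two queries above may make the difference concrete. This is not Lucene's actual scoring formula: the score function, the flat 1.0 per matching term, and the sample docs are all assumptions made up for illustration.

```python
# Toy model only; NOT Lucene's real scoring formula.
# Assumption: every matching term contributes a flat 1.0 to the score.

def score(doc_terms, required=(), optional=()):
    """Return a toy score for one document."""
    doc = set(doc_terms)
    # A missing required (+) term forces the score to 0.
    if any(term not in doc for term in required):
        return 0.0
    # Every matching term, required or optional, adds to the score.
    return float(sum(1 for term in (*required, *optional) if term in doc))

# Hypothetical sample documents.
docs = {
    "both": ["apples", "oranges"],
    "apples_only": ["apples"],
    "oranges_only": ["oranges"],
    "neither": ["pears"],
}

# Query: +apples oranges
plus_query = {
    name: score(terms, required=["apples"], optional=["oranges"])
    for name, terms in docs.items()
}

# Query: apples oranges
or_query = {
    name: score(terms, optional=["apples", "oranges"])
    for name, terms in docs.items()
}

# Only docs with a non-zero score would be returned.
print({name: s for name, s in plus_query.items() if s > 0})
print({name: s for name, s in or_query.items() if s > 0})
```

In this model the doc with only oranges drops out of the first query but is still returned by the second, and the doc with both terms scores highest in each.
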
Conversely, the prohibit operator (-) is called out separately from the NOT operator:
 To search for documents that contain "apples" but not "oranges" use
the query:  "apples" -"oranges"
I do not understand why this isn't simply equivalent to: apples AND NOT oranges
It is equivalent. The prohibit operator forces a score of 0 on any doc that contains the term. Finding apples might put a positive score on a doc, but finding oranges then sets the score to 0 no matter what the other terms contributed. That is also why it cannot be used as a unary NOT: -oranges on its own would score every doc as 0, so nothing would be returned. If you combine the special MatchAllDocsQuery with -oranges, you get the effect of a unary NOT: MatchAllDocsQuery scores every doc positively, and the - then zeroes out every doc that contains the prohibited term.
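
Here is a rough sketch of the prohibit behaviour in the same spirit. Again, this is a toy model and not Lucene's scoring code: the score function, its match_all flag (standing in for a match-all clause), and the sample docs are hypothetical names chosen for illustration.

```python
# Toy model only; NOT Lucene's real scoring formula.
# Assumption: each matching clause contributes a flat 1.0.

def score(doc_terms, optional=(), prohibited=(), match_all=False):
    """Return a toy score for one document."""
    doc = set(doc_terms)
    # A prohibited (-) term zeroes the doc no matter what else matched.
    if any(term in doc for term in prohibited):
        return 0.0
    # A match-all clause gives every doc a positive starting score.
    total = 1.0 if match_all else 0.0
    total += sum(1.0 for term in optional if term in doc)
    return total

# Hypothetical sample documents.
docs = {
    "both": ["apples", "oranges"],
    "apples_only": ["apples"],
    "oranges_only": ["oranges"],
    "neither": ["pears"],
}

# Query: apples -oranges
apples_not_oranges = {
    name: score(terms, optional=["apples"], prohibited=["oranges"])
    for name, terms in docs.items()
}

# Query: -oranges (nothing can ever score above 0)
bare_prohibit = {
    name: score(terms, prohibited=["oranges"])
    for name, terms in docs.items()
}

# Query: match-all clause plus -oranges (behaves like a unary NOT)
unary_not = {
    name: score(terms, prohibited=["oranges"], match_all=True)
    for name, terms in docs.items()
}

print(apples_not_oranges)
print(bare_prohibit)
print(unary_not)
```

In this sketch the bare -oranges query leaves every doc at 0, while adding the match-all clause returns exactly the docs that do not contain oranges.
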

...if it is, why all the big fuss about calling it "prohibit" and not
just another alias for NOT?
...if it isn't, then what's the difference in behavior?
It's kind of like an ANDNOT in boolean terms.


The fact that the documentation calls out these operators separately,
gives them their own unique names, and describes them in different
terms is enough to make me think something very important or very
subtle is going on.
The subtle part is that Lucene uses a scoring system that operates in something of a boolean fashion, but with subtle differences.

- Mark
