I'm trying to understand the specifics behind the notation +(...) and -(...) as it applies to the standard parser.
I have three lists of words. I want documents that have at least one word from list A and also at least one word from list B (matching just one list isn't enough), and, finally, no document may contain any word from list C. I believe the correct syntax for that is:

    +(apple animal aspirin) +(bacon banana book) -candle -computer -currency

Can someone confirm that?

What I'm trying to convince myself of is that -(candle computer currency) doesn't do what one might think at first glance. However, Lucene seems to be giving the correct answer, though I'm having a hard time understanding why.

Let me explain with some simple pseudo code:

    X = 1
    IF (X != 1 OR X != 2) THEN True ELSE False

It ought to come as no surprise that this actually evaluates to True. The reason is that X != 1 is false, and X != 2 is true, and false or true is ...true. More interestingly, this statement is always true, whatever X is (if X is 2, it can't also be 1, so the X != 1 part of the expression is true).

Thus, moving back to Lucene from trivial boolean algebra, the notation -(candle OR computer OR currency) would, in my mind, match any and all documents unless every word in the negation list was found. Clearly this can't be right. Is the minus operator distributive? I suspect what I'm seeing is that Lucene is not doing boolean logic at all, but set operations.

A co-worker of mine came up with an interesting syntax, and I had no idea what it meant either:

    +( -A -B )

...which to him meant "must have no A and no B".

Can anyone clarify how + and - work on groups, and whether the above has any coherent meaning?

-Walt Stoneburner
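
P.S. To make the "set operations" guess concrete, here is a rough Python sketch. It is a toy model, not Lucene code: the documents, their ids, and the `matching` helper are all made up for illustration. It only shows what the two readings would predict, not how Lucene works internally.

```python
# Toy model (an assumption for illustration): a document is just a set of words.
DOCS = {
    1: {"apple", "bacon"},
    2: {"apple", "bacon", "candle"},
    3: {"animal", "book", "computer", "currency"},
    4: {"apple"},
    5: {"banana", "aspirin"},
    6: {"animal", "book", "candle", "computer", "currency"},
}

def matching(words):
    """Ids of documents containing at least one of the given words (an OR group)."""
    wanted = set(words)
    return {doc_id for doc_id, text in DOCS.items() if text & wanted}

A = ["apple", "animal", "aspirin"]
B = ["bacon", "banana", "book"]
C = ["candle", "computer", "currency"]

# +(A...) +(B...): a document must be in both groups.
required = matching(A) & matching(B)

# Set reading of -(candle computer currency): subtract the whole group at once.
grouped = required - matching(C)

# Set reading of -candle -computer -currency: subtract each word in turn.
separate = required - matching(["candle"]) - matching(["computer"]) - matching(["currency"])

# Naive boolean reading: keep a document if ANY of the C words is absent,
# i.e. NOT candle OR NOT computer OR NOT currency.
naive = {d for d in required if any(w not in DOCS[d] for w in C)}

print(sorted(grouped))   # [1, 5]
print(sorted(separate))  # [1, 5]
print(sorted(naive))     # [1, 2, 3, 5]
```

Under the set reading, the grouped and the separate forms remove exactly the same documents, which would explain why both appear to work. The naive boolean reading only drops document 6, the one that contains all three list C words.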