Shawn,
Thank you for the reply. The URL you gave was helpful, and the Smiley and Pugh
book even more so. On page 140, they indicate that mm=100% with dismax is
analogous to the standard parser's q.op=AND. This is exactly what I need.
However, testing these queries with edismax, I get different numbers of results:
q=Title:(life hope) AND Title:(life)&q.op=AND (standard query parser) - 1 result
q=Title:(life AND hope) AND Title:(life)&defType=edismax - 1 result
q=Title:(life hope) AND Title:(life)&defType=edismax&mm=100% - 285 results
(Uh-oh. Looks like the first two terms get OR'ed.)
The dismax parser seems to behave as documented:
q=life hope life&defType=dismax&rows=0&qf=Title&mm=0% - 285 results (results
are OR'ed as expected)
q=life hope life&defType=dismax&rows=0&qf=Title&mm=100% - 1 result (results are
AND'ed as expected)
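For anyone reproducing these tests, here is a minimal Python sketch of how the requests can be built as properly encoded URLs. The endpoint `http://localhost:8983/solr/select` and the helper name `build_select_url` are assumptions for illustration only; substitute your own host and handler. The point is simply that each parameter is separated by `&` and that the `%` in the mm value must be URL-encoded.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; replace with your own Solr host/handler.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_select_url(q, **params):
    """Return a select URL with q and any extra parameters URL-encoded."""
    query = {"q": q, **params}
    # urlencode joins parameters with '&' and escapes spaces, ':', '(', ')' and '%'.
    return SOLR_SELECT_URL + "?" + urlencode(query)


# The two dismax tests: same terms, only mm differs.
dismax_or = build_select_url(
    "life hope life", defType="dismax", rows=0, qf="Title", mm="0%")
dismax_and = build_select_url(
    "life hope life", defType="dismax", rows=0, qf="Title", mm="100%")

# The edismax test that unexpectedly returns 285 results.
edismax = build_select_url(
    "Title:(life hope) AND Title:(life)", defType="edismax", mm="100%")

for url in (dismax_or, dismax_and, edismax):
    print(url)
```

Note that `mm=100%` goes over the wire as `mm=100%25`; sending a bare `%` can be misread by the servlet container.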
Unfortunately, I need to be able to combine the use of pf with key:value
syntax, wildcards, etc., so I think I need to use edismax.
With a quick glance at ExtendedDismaxQParserPlugin, I'm finding...
- MM is ignored if there are any of these operators in the query (OR NOT + -)
... but AND is ok (line 227)
- MM is ignored if the parse method did not return a BooleanQuery instance
(line 244)
- MM is used after all regardless of operators used in the query, so long as
it's a BooleanQuery (line 286)
- The default MM value is 100% if not specified in the query parameters
(lines 241, 283)
Given the apparent contradiction here, my very quick analysis is surely missing
something! But if this is accurate, then the trick is to formulate the query
in such a way that parse returns an instance of BooleanQuery, right?
Any more advice anyone can give is appreciated! For the client I'm responsible
for, I'm just inserting explicit operators between all of the user's query
terms. But for the client I'm not responsible for, I would love to have a
workaround for the other developers! I think they'd appreciate it...
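A minimal Python sketch of that kind of client-side rewriting is below. It is an illustration under simplifying assumptions, not the actual client code: the function name `insert_explicit_and` is hypothetical, and it only handles bare terms and standalone double-quoted phrases (no fielded phrases such as Title:"new life", no nested parentheses, no + or - prefixes).

```python
import re

# Boolean keywords that are already explicit and must not be doubled up.
_OPERATORS = {"AND", "OR", "NOT"}

# A token is either a double-quoted phrase or a run of non-whitespace.
_TOKEN = re.compile(r'"[^"]*"|\S+')


def insert_explicit_and(user_query: str) -> str:
    """Insert AND between adjacent terms that have no operator between them."""
    out = []
    for tok in _TOKEN.findall(user_query):
        # Add AND only when neither this token nor the previous one is an operator.
        if out and tok not in _OPERATORS and out[-1] not in _OPERATORS:
            out.append("AND")
        out.append(tok)
    return " ".join(out)


print(insert_explicit_and("life hope"))
print(insert_explicit_and('"new life" hope OR faith'))
```

With explicit operators everywhere, the result no longer depends on how the parser treats mm for mixed queries.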
James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311
-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org]
Sent: Wednesday, December 22, 2010 4:08 PM
To: solr-user@lucene.apache.org
Subject: Re: edismax inconsistency -- AND/OR
On 12/22/2010 8:25 AM, Dyer, James wrote:
> I'm using SOLR 1.4.1 with SOLR-1553 applied (edismax query parser). I'm
> experiencing inconsistent behavior with terms grouped in parentheses.
> Sometimes they are AND'ed and sometimes OR'ed together.
> 1. q=Title:(life)&defType=edismax 285 results
> 2. q=Title:(hope)&defType=edismax 34 results
> 3. q=Title:(life AND hope)&defType=edismax 1 result
> 4. q=Title:(life OR hope)&defType=edismax 318 results
> 5. q=Title:(life hope)&defType=edismax 1 result (life, hope are being
> AND'ed together)
> 6. q=Title:(life AND hope) AND Title:(life)&defType=edismax 1 result
> 7. q=Title:(life OR hope) AND Title:(life)&defType=edismax 285 results
> 8. q=Title:(life hope) AND Title:(life)&defType=edismax 285 results (life,
> hope are being OR'ed together)
> See how in #5, the two terms get AND'ed, but by adding the additional
> (nonsense) clause in #8, the first two terms get OR'ed. Is this a feature
> or a bug? Am I likely doing something wrong?
The dismax parser doesn't pay any attention to the default query
operator. In the absence of explicit operators in the actual query, edismax
likely doesn't either. What matters is the value of the mm parameter,
also known as minimum 'should' match. If your mm value is 50%, which
is a common value to see in dismax examples, I believe it would behave
exactly like you are seeing.
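To make the mm arithmetic concrete, here is a rough Python sketch of how the simple forms of an mm value (a plain integer or a percentage, positive or negative) translate into a required number of optional clauses, following the rules in the min-should-match documentation: percentages are rounded down, and negative values say how many clauses may be missing. The function name `min_should_match` is hypothetical, this is not Solr's own code, and the conditional forms such as 3<90% are deliberately left out.

```python
def min_should_match(optional_clauses: int, mm: str) -> int:
    """Number of optional clauses that must match for a simple mm spec."""
    spec = mm.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Round the percentage of clauses down (integer division).
        magnitude = optional_clauses * abs(percent) // 100
        # A negative percentage is the share of clauses allowed to be missing.
        required = optional_clauses - magnitude if percent < 0 else magnitude
    else:
        count = int(spec)
        # A negative integer is the number of clauses allowed to be missing.
        required = optional_clauses + count if count < 0 else count
    # Never require more clauses than exist, and never fewer than zero.
    return max(0, min(optional_clauses, required))


for spec in ("0%", "50%", "100%"):
    print(spec, min_should_match(2, spec))
```

For a two-term query such as "life hope", mm=50% requires only one term to match (effectively OR), while mm=100% requires both (effectively AND).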
This is a complex little beast. Just a couple of weeks ago, Chris
Hostetter said that although he wrote the code and the syntax for mm,
the explanation for the parameter that's in the Smiley and Pugh Solr
book (pages 138-140) is the clearest he's ever seen.
Here's some detailed documentation on it. I can't find my copy of the
book right now, so I don't know if this is as good as what's in it:
http://lucene.apache.org/solr/api/org/apache/solr/util/doc-files/min-should-match.html
Hopefully this is applicable to you, and not something you already
thought of!
Shawn