The "variance" is most likely because your "text" field is analyzed differently from the source fields you list in your dismax "qf". For example, some of them may be "string" fields with no analysis. As a result, fewer of those fields match your query terms when using dismax.

Look at the results of both queries and then try querying on the specific fields of a document that is found by the traditional Lucene/Solr query parser but not found using dismax.
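
One way to run that per-field check is sketched below in Python. It only builds the request URLs; it does not contact Solr. The base URL, the document id "DOC-1", and the assumption that "id" is the unique key field are all hypothetical placeholders, so adjust them to your schema.

```python
from urllib.parse import urlencode


def per_field_queries(base_url, doc_id, fields, terms, id_field="id"):
    """Build one standard-parser query URL per (field, term) pair.

    Each query is restricted with fq to a single document, so a
    numFound of 0 means that field does not match that term for
    that document.
    """
    urls = {}
    for field in fields:
        for term in terms:
            params = {
                "q": f"{field}:{term}",          # query a single field
                "fq": f'{id_field}:"{doc_id}"',  # assumed unique key field
                "rows": 0,                        # only the count is needed
                "debugQuery": "on",
            }
            urls[(field, term)] = f"{base_url}?{urlencode(params)}"
    return urls


# Hypothetical base URL and document id, for illustration only.
urls = per_field_queries(
    "http://localhost:8983/solr/select",
    "DOC-1",
    ["name", "text", "r_name", "id", "xid"],
    ["test", "Log"],
)
for key, url in urls.items():
    print(key, url)
```

Pick a document that the standard query returns but dismax does not, then compare which fields report a match for each term.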

-- Jack Krupansky

-----Original Message-----
From: mechravi25
Sent: Friday, June 15, 2012 1:16 AM
To: solr-user@lucene.apache.org
Subject: Solr Search Count Variance

Hi all,

When we send a search request to Solr, the part of the request URL
containing the search query is as follows

..../select/?qf=name%5e2.3+text+r_name%5e0.3+id%5e0.3+xid%5e0.3&fl=*&f.tFacet.facet.mincount=1&facet.field=tFacet&f.rFacet.facet.mincount=1&facet.field=rFacet&facet=true&hl.fl=*&hl=true&rows=10&start=0&q=test+Log&debugQuery=on?

We find the number of documents returned to be 5000 (approx.). Here, it makes
use of the standard handler and we get the parsed query as follows

<str name="parsedquery">(text:Cxx1 text:test) (text:Dyy3 text:Log)</str>
<str name="parsedquery_toString">(text:Cxx1 text:test) (text:Dyy3
text:Log)</str>

Here, text is the default field used by the standard handler, and it is
the destination field for all the other fields.

The same way, when we alter the above url to fetch the result by using the
dismax handler,

..../select/?qf=name%5e2.3+text+r_name%5e0.3+id%5e0.3+xid%5e0.3&qt=dismax&fl=*&f.tFacet.facet.mincount=1&facet.field=tFacet&f.rFacet.facet.mincount=1&facet.field=rFacet&facet=true&hl.fl=*&hl=true&rows=10&start=0&q=test+Log&debugQuery=on?

We find the number of documents found to be 710 and the parsed query is as
follows

<str name="parsedquery">+((DisjunctionMaxQuery((xid:test^0.3 | id:test^0.3 |
((r_name:Cxx1 r_name:test)^0.3) | (text:Cxx1 text:test) | ((name:Cxx1
name:test)^2.3))) DisjunctionMaxQuery((xid:Log^0.3 | id:Log^0.3 |
((r_name:Dyy3 r_name:Log)^0.3) | (text:Dyy3 text:Log) | ((name:Dyy3
name:Log)^2.3))))~2) ()</str>
 <str name="parsedquery_toString">+(((xid:test^0.3 | id:test^0.3 |
((r_name:Cxx1 r_name:test)^0.3) | (text:Cxx1 text:test) | ((name:Cxx1
name:test)^2.3)) (xid:Log^0.3 | id:Log^0.3 | ((r_name:Dyy3 r_name:Log)^0.3)
| (text:Dyy3 text:Log) | ((name:Dyy3 name:Log)^2.3)))~2) ()</str>

If we specify the same boosts as dismax in the q parameter for the standard
handler, it works fine, i.e. the total number of documents fetched is 710.
The query used is as follows

q:(name:test^2.3 AND name:Log^2.3)OR(text:test AND
text:Log)OR(r_name:test^0.3 AND r_name:Log^0.3)OR(id:test^0.3 AND
id:Log^0.3)OR(xid:test^0.3 AND xid:Log^0.3)
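
For reference, a query of this shape can be assembled from the qf list rather than typed by hand. The following is a minimal Python sketch; the helper name explicit_query is hypothetical, and it only builds the query string, mirroring the hand-written query above (every term required within a field, fields combined with OR).

```python
def explicit_query(qf, terms):
    """Expand a dismax-style qf string into an explicit boolean query.

    Within each field all terms are joined with AND; the per-field
    groups are then joined with OR. Boosts from qf are copied onto
    every term of the corresponding field.
    """
    clauses = []
    for spec in qf.split():
        # "name^2.3" -> field "name", boost "2.3"; "text" -> no boost
        field, _, boost = spec.partition("^")
        suffix = f"^{boost}" if boost else ""
        clause = " AND ".join(f"{field}:{term}{suffix}" for term in terms)
        clauses.append(f"({clause})")
    return " OR ".join(clauses)


query = explicit_query("name^2.3 text r_name^0.3 id^0.3 xid^0.3", ["test", "Log"])
print(query)
```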

I have two doubts here

1. Why is there a count difference of this extent between the standard and
dismax handler?
2. Does the dismax handler use AND operation in the phrase query (when we
use with/without quotes)?

Could you please explain this?

Thanks in advance

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Search-Count-Variance-tp3989760.html
Sent from the Solr - User mailing list archive at Nabble.com.
