Thank you for your reply, Erick.

I've thought about the TermsQueryParser, but AFAIK it doesn't support phrase search.

I want to query for nearby words such as "Mycobacterium tuberculosis", and I would also like to use the tilde syntax, "Mycobacterium tuberculosis"~2.

Does a parser exist for that, so that I can plug it into my query using the special '_query_' field?
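
To make the question concrete, here is a rough sketch of what I have in mind: building the proximity phrases client-side and referencing them through '_query_' with a dereferenced parameter. The helper names, the "phrases" parameter name, the second example phrase, and the use of the lucene parser with df=abstract_methods are assumptions for illustration only, not a confirmed solution.

```python
def proximity_phrase(phrase: str, slop: int = 0) -> str:
    """Quote a phrase for the Lucene query syntax, appending ~slop when slop > 0."""
    # Escape backslashes first, then embedded double quotes.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    quoted = f'"{escaped}"'
    return f"{quoted}~{slop}" if slop > 0 else quoted


def nested_phrase_params(phrases, field, slop=0, param_name="phrases"):
    """Return request parameters where q points at the phrase list via _query_.

    The phrase list lives in its own request parameter (param_name, hypothetical)
    and is dereferenced with v=$param_name, so no nested quote escaping is needed.
    """
    body = " OR ".join(proximity_phrase(p, slop) for p in phrases)
    return {
        "q": f'_query_:"{{!lucene df={field} v=${param_name}}}"',
        param_name: body,
    }


params = nested_phrase_params(
    ["Mycobacterium tuberculosis", "Koch bacillus"],
    field="abstract_methods",  # field name taken from the parsedquery output; adjust as needed
    slop=2,
)
for name, value in params.items():
    print(f"{name}={value}")
```

Whether '_query_' is honoured depends on the parser handling q, so this would need to be checked against the actual setup.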

The fq is not an option because I already use it for license filtering.


WRT the mm parameter, by default it is set to 1 since my q.op=OR:

https://stackoverflow.com/questions/32222139/solr-mm-paramter-of-dismax-parser

If I set it to 0 explicitly, I get the same result for query B (~6.000.000 documents).

By the way, what is the meaning of setting it to 0?

The documentation says that the mm parameter accepts a "positive" integer:

https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html
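
For reference, the way an mm value turns into a required number of optional clauses can be sketched roughly as below. This is a simplified model based only on the documented integer and percentage forms (conditional specs such as 3<90% are left out), and the function name is made up for illustration. With mm=0 the mm rule itself requires none of the optional clauses, but as far as I know a Lucene boolean query made only of optional clauses still needs at least one of them to match, which would be consistent with mm=0 and mm=1 returning the same count.

```python
def min_should_match(spec: str, optional_clauses: int) -> int:
    """Simplified model of how an mm spec maps to a required clause count.

    Handles plain integers ("2", "-2") and percentages ("75%", "-25%").
    Negative values mean "this many (or this share) may be missing".
    """
    spec = spec.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Percentages are rounded down in magnitude before being applied.
        magnitude = optional_clauses * abs(percent) // 100
        calc = magnitude if percent >= 0 else -magnitude
    else:
        calc = int(spec)
    result = optional_clauses + calc if calc < 0 else calc
    # Clamp to the valid range [0, optional_clauses].
    return max(0, min(result, optional_clauses))


for spec in ("0", "1", "75%", "-25%", "-2"):
    print(spec, "->", min_should_match(spec, 5))
```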


Thank you

Danilo


On 15/11/18 18:07, Erick Erickson wrote:
You're using edismax which has the "mm" parameter that you can think
of as a sliding scale between pure OR and pure AND. What happens if
you set it to zero?

As for maxBooleanClauses, the easiest/fastest way around that would be
to use an "fq" clause and the TermsQueryParser.
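
A minimal sketch of that suggestion, assuming single-token values and made-up field names and values: the terms parser takes a comma-separated list through local params, and since a request can carry several fq parameters, it can sit next to an existing filter. As far as I understand, the values are matched as indexed terms without analysis, so stemmed or lowercased fields need the indexed form.

```python
from urllib.parse import urlencode


def terms_filter(field, terms):
    """Build a filter query string for the terms query parser."""
    return f"{{!terms f={field}}}" + ",".join(terms)


# Hypothetical request: the license filter and field names are placeholders.
params = [
    ("q", "*:*"),
    ("fq", "license:open"),
    ("fq", terms_filter("abstract_methods", ["tuberculosis", "mycobacterium"])),
]
query_string = urlencode(params)
print(query_string)
```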

Best,
Erick
On Thu, Nov 15, 2018 at 7:52 AM Danilo Tomasoni <tomas...@cosbi.eu> wrote:
Hello all,

I'm running some queries with a large list of OR'ed terms on our Solr
instance, and this odd situation came up:


- A. query with N alternatives returns ~130.000 documents

- B. query with N-3 alternatives returns ~ 6.000.000 documents


N is relatively small in this case, but in general can be large.


How is it possible that specifying fewer terms to match yields a higher
number of results?

The query is fully positive (no - or NOT inside).


Query A/B are attached.

I also tried with debug=all and I noticed

"parsedquery": "+(DisjunctionMaxQuery((abstract_methods:tuberculosi |
... ) DisjunctionMaxQuery( | .. | .. )


just on the first sub-parenthesis of the query. Why is that? Is this the
reason for the change in the number of results? If so, how can I create a
pure-OR query (everything optional)?


If you are wondering why I'm adding sub-parentheses, that's to avoid the
max boolean clauses error (if you know some other method that allows
phrase searches, please tell me).
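
The grouping described above can be sketched as follows. The function name and group size are hypothetical; the clauses are assumed to be already-formed query fragments (single terms or quoted phrases).

```python
def grouped_or_query(clauses, group_size):
    """Split a long list of OR'ed clauses into parenthesized sub-groups."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    groups = [
        clauses[i:i + group_size]
        for i in range(0, len(clauses), group_size)
    ]
    # Each group becomes one parenthesized disjunction; groups are OR'ed together.
    return " OR ".join("(" + " OR ".join(group) + ")" for group in groups)


clauses = ['"Mycobacterium tuberculosis"~2', "tuberculosis", "mycobacterium"]
print(grouped_or_query(clauses, group_size=2))
```

Each group then stays below the clause limit on its own, at the cost of a deeper query tree.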


Thank you

Danilo




--
Danilo Tomasoni
COSBI

As for the European General Data Protection Regulation 2016/679 on the 
protection of natural persons with regard to the processing of personal data, 
we inform you that all the data we possess are object of treatment in the 
respect of the normative provided for by the cited GDPR.

It is your right to be informed on which of your data are used and how; you may 
ask for their correction, cancellation or you may oppose to their use by 
written request sent by recorded delivery to The Microsoft Research – 
University of Trento Centre for Computational and Systems Biology Scarl, Piazza 
Manifattura 1, 38068 Rovereto (TN), Italy.
