RE: [SPAM] Re: query parsed in different ways in two identical solr instances
Yes, I said "identical" because the configuration (solrconfig.xml etc.) is identical; only some fields changed.
Sorry I was not more precise in describing the environment.
Nice to know it's already fixed.

Danilo Tomasoni

Fondazione The Microsoft Research - University of Trento Centre for Computational and Systems Biology (COSBI)
Piazza Manifattura 1, 38068 Rovereto (TN), Italy
tomas...@cosbi.eu
http://www.cosbi.eu

As for the European General Data Protection Regulation 2016/679 on the protection of natural persons with regard to the processing of personal data, we inform you that all the data we possess are object of treatment in the respect of the normative provided for by the cited GDPR.
It is your right to be informed on which of your data are used and how; you may ask for their correction, cancellation or you may oppose to their use by written request sent by recorded delivery to The Microsoft Research – University of Trento Centre for Computational and Systems Biology Scarl, Piazza Manifattura 1, 38068 Rovereto (TN), Italy.
Please don't print this e-mail unless you really need to

From: Alexandre Rafalovitch [arafa...@gmail.com]
Sent: 10 June 2019 15:32
To: solr-user
Subject: Re: [SPAM] Re: query parsed in different ways in two identical solr instances

Ok, great. We now moved from "identical setup breaks things in a bugfix version" to "strange behavior when field does not exist". The "identical" part was actually throwing us off the trail.

And all this leads us to https://issues.apache.org/jira/browse/SOLR-5163 , fixed in 8.0.

Hope it helps,
Alex.
On Mon, 10 Jun 2019 at 09:19, Danilo Tomasoni wrote:
>
> Hello, I was able to reproduce this behaviour in an isolated environment,
> and performed some differential analysis between the two versions (which
> have different schemas; diff of schemas attached).
>
> With the schema of solr1, the query is parsed as +(+() +())
> while with the schema of solr-test, the same query is parsed as +(() ())
>
> The query is
>
> "q":"(f1:PUBMEDPMID12159614 AND (_query_:\"{!edismax qf='medline_chemical_terms medline_mesh_terms' q.op=OR mm=1 v=$subquery1}\"))"
>
> In solr1 and also in solr-test, f1 equals
>
> "f.f1.qf":"id pmid pmc source_id other_id doi manuscript_id publication_id secondary_ids"}}
>
> And then I suddenly remembered that the field secondary_ids was renamed to
> external_data in solr-test (before the bulk import).
>
> So I changed the f1 definition, removing secondary_ids and adding
> external_data, and now the behaviour is the same!
>
> How is that possible? Why can the schema (and in this case a non-existing
> field) influence the behaviour of the query parser in such a profound way?
>
> I think that this is a subtle bug, and an error should be raised instead of
> performing an unexpected query.
>
> Danilo Tomasoni
>
> From: Alexandre Rafalovitch [arafa...@gmail.com]
> Sent: 10 June 2019 12:49
> To: solr-user
> Subject: [SPAM] Re: query parsed in different ways in two identical solr instances
>
> Were you able to simplify it to the simplest use case showing the issue? Or
> reproduce it on the stock Solr with stock example? Because otherwise, we
> would be just as stuck in a Jira as now. It is the same people helping.
>
> For example, is the _query_ part significant?
>
> Also, did you try running both queries with echoParams=all just to
> eliminate stray differences? I know you looked at the debug line, but
> perhaps this is worth a check too.
>
> Regards,
> Alex
>
> On Mon, Jun 10, 2019, 5:46 AM Danilo Tomasoni wrote:
>
> > Hello all,
> > maybe I should consider this as a bug and open an issue?
Re: [SPAM] Re: query parsed in different ways in two identical solr instances
Ok, great. We now moved from "identical setup breaks things in a bugfix version" to "strange behavior when field does not exist". The "identical" part was actually throwing us off the trail.

And all this leads us to https://issues.apache.org/jira/browse/SOLR-5163 , fixed in 8.0.

Hope it helps,
Alex.
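The failure mode discussed in this thread is a qf alias that names a field missing from the schema. On a release without the SOLR-5163 fix, one way to catch this before querying is a client-side check. The following is only a minimal sketch under stated assumptions: the helper names (qf_field_names, missing_qf_fields, fetch_schema_fields) are hypothetical, the field set in the example is made up for illustration, and the fetch helper assumes the instance exposes the Schema API at /solr/<collection>/schema/fields. Dynamic fields are not taken into account.

```python
import json
import urllib.request


def qf_field_names(qf: str) -> list[str]:
    """Split a qf string into bare field names, dropping any ^boost suffix."""
    return [token.split("^", 1)[0] for token in qf.split()]


def missing_qf_fields(qf: str, schema_fields: set[str]) -> list[str]:
    """Return the qf fields that are not in schema_fields, in qf order."""
    return [name for name in qf_field_names(qf) if name not in schema_fields]


def fetch_schema_fields(base_url: str, collection: str) -> set[str]:
    """Fetch explicitly declared field names through the Schema API.

    Assumption: base_url looks like "http://localhost:8983" and the
    collection name is valid. Dynamic fields are not expanded here.
    """
    url = f"{base_url}/solr/{collection}/schema/fields?wt=json"
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    return {field["name"] for field in payload["fields"]}


# f.f1.qf value quoted in the thread.
f1_qf = ("id pmid pmc source_id other_id doi manuscript_id "
         "publication_id secondary_ids")

# Hypothetical field set standing in for solr-test, where secondary_ids
# was renamed to external_data. In practice this would come from
# fetch_schema_fields(...).
solr_test_fields = {
    "id", "pmid", "pmc", "source_id", "other_id", "doi",
    "manuscript_id", "publication_id", "external_data",
}

print(missing_qf_fields(f1_qf, solr_test_fields))  # prints ['secondary_ids']
```

Keeping the comparison in a pure function means it can be run against a saved field list as well as a live instance.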
RE: [SPAM] Re: query parsed in different ways in two identical solr instances
Hello, I was able to reproduce this behaviour in an isolated environment, and performed some differential analysis between the two versions (which have different schemas; diff of schemas attached).

With the schema of solr1, the query is parsed as +(+() +())
while with the schema of solr-test, the same query is parsed as +(() ())

The query is

"q":"(f1:PUBMEDPMID12159614 AND (_query_:\"{!edismax qf='medline_chemical_terms medline_mesh_terms' q.op=OR mm=1 v=$subquery1}\"))"

In solr1 and also in solr-test, f1 equals

"f.f1.qf":"id pmid pmc source_id other_id doi manuscript_id publication_id secondary_ids"}}

And then I suddenly remembered that the field secondary_ids was renamed to external_data in solr-test (before the bulk import).

So I changed the f1 definition, removing secondary_ids and adding external_data, and now the behaviour is the same!

How is that possible? Why can the schema (and in this case a non-existing field) influence the behaviour of the query parser in such a profound way?

I think that this is a subtle bug, and an error should be raised instead of performing an unexpected query.

Danilo Tomasoni

From: Alexandre Rafalovitch [arafa...@gmail.com]
Sent: 10 June 2019 12:49
To: solr-user
Subject: [SPAM] Re: query parsed in different ways in two identical solr instances

Were you able to simplify it to the simplest use case showing the issue? Or reproduce it on the stock Solr with stock example? Because otherwise, we would be just as stuck in a Jira as now. It is the same people helping.

For example, is the _query_ part significant?

Also, did you try running both queries with echoParams=all just to eliminate stray differences? I know you looked at the debug line, but perhaps this is worth a check too.

Regards,
Alex

On Mon, Jun 10, 2019, 5:46 AM Danilo Tomasoni wrote:

> Hello all,
> maybe I should consider this as a bug and open an issue?
>
> Danilo Tomasoni
>
> From: Danilo Tomasoni
> Sent: 07 June 2019 11:47
> To: solr-user@lucene.apache.org
> Subject: RE: query parsed in different ways in two identical solr instances
>
> Any thoughts on that difference in the Solr parsing? Is it correct that
> the first looks like an AND while the second looks like an OR?
> Thank you
>
> Danilo Tomasoni
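The echoParams=all check suggested in the quoted reply can be scripted. This is a rough sketch, not something posted in the thread: debug_url and diff_params are hypothetical helper names, the host and collection names are placeholders, and the two parameter dictionaries are invented for illustration. It assumes the effective parameters of each response are read from responseHeader.params and compared afterwards. A diff like this only surfaces request-level differences. A schema difference, such as the renamed field found here, would not show up in it.

```python
from urllib.parse import urlencode


def debug_url(base_url: str, collection: str, params: dict[str, str]) -> str:
    """Build a /select URL that echoes all params and includes debug output."""
    query = dict(params, echoParams="all", debugQuery="true", wt="json")
    return f"{base_url}/solr/{collection}/select?{urlencode(query)}"


def diff_params(left: dict, right: dict) -> dict:
    """Map each differing param to a (left, right) pair; None marks absence."""
    return {
        key: (left.get(key), right.get(key))
        for key in sorted(set(left) | set(right))
        if left.get(key) != right.get(key)
    }


# Placeholder host and collection, used only to show the URL shape.
print(debug_url("http://localhost:8983", "solr1", {"q": "*:*"}))

# Invented examples of what responseHeader.params might contain on two
# instances; the values are illustrative only.
params_a = {"q": "f1:X", "q.op": "AND", "rows": "10"}
params_b = {"q": "f1:X", "q.op": "OR"}
print(diff_params(params_a, params_b))
```

If the diff comes back empty and the parsed queries still differ, the next place to look is the schema rather than the request.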