Re: QueryParser changes query by itself [solved]
Hi Bernd,

> On Aug 22, 2017, at 4:31 AM, Bernd Fehling wrote:
>
> But the QueryBuilder only calls "stream.reset()", it never calls
> "stream.end()" so that Filters
> in the Analyzer chain can't do any cleanup (like my Filter wanted to do).
> I moved my "cleanup" into reset() which feels like a dirty hack.
>
> My opinion, in lucene QueryBuilder there should be a "stream.end()" after
> consuming the stream:
>   ...
>   stream.reset();
>   while (stream.incrementToken()) {
>     numTokens++;
>     ...
>   }
>   stream.end();
>   ...

The stream here is a CachingTokenFilter wrapping the passed-in TokenStream.
On the first call to cache.incrementToken(), CachingTokenFilter's cache is
populated by exhausting the wrapped stream and then calling its end() method.

--
Steve
www.lucidworks.com
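To make Steve's point concrete, here is a minimal sketch in plain Java of the caching behaviour he describes. It does not use Lucene: SimpleStream, RecordingSource, CachingStream and CachingStreamDemo are hypothetical stand-ins invented for illustration, and the real CachingTokenFilter works on attribute states rather than strings. The sketch only shows that the wrapped stream's end() runs during the first read, even though the consumer never calls end() itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token stream; NOT Lucene's TokenStream API.
interface SimpleStream {
    void reset();
    String next();   // returns null when the stream is exhausted
    void end();
}

// A source that counts end() calls, like a filter that does cleanup in end().
class RecordingSource implements SimpleStream {
    private final List<String> tokens;
    private int pos;
    int endCalls;

    RecordingSource(List<String> tokens) { this.tokens = tokens; }

    public void reset() { pos = 0; }
    public String next() { return pos < tokens.size() ? tokens.get(pos++) : null; }
    public void end() { endCalls++; }
}

// Models the behaviour described above: the first next() drains the wrapped
// stream into a cache and then calls the wrapped stream's end().
class CachingStream implements SimpleStream {
    private final SimpleStream input;
    private List<String> cache;
    private Iterator<String> it;

    CachingStream(SimpleStream input) { this.input = input; }

    public void reset() {
        if (cache == null) {
            input.reset();            // nothing cached yet: reset the source
        } else {
            it = cache.iterator();    // replay from the cache
        }
    }

    public String next() {
        if (cache == null) {
            cache = new ArrayList<>();
            String t;
            while ((t = input.next()) != null) {
                cache.add(t);
            }
            input.end();              // cleanup hook of the wrapped stream runs here
            it = cache.iterator();
        }
        return it.hasNext() ? it.next() : null;
    }

    public void end() { /* the wrapped end() already ran while filling the cache */ }
}

class CachingStreamDemo {
    // Consumes the stream the way the quoted loop does, without calling end().
    static int endCallsAfterConsuming(List<String> tokens) {
        RecordingSource source = new RecordingSource(tokens);
        CachingStream stream = new CachingStream(source);
        stream.reset();
        while (stream.next() != null) {
            // count or inspect tokens here
        }
        return source.endCalls;
    }

    public static void main(String[] args) {
        System.out.println(endCallsAfterConsuming(Arrays.asList("arms", "trade")));
    }
}
```

Under these assumptions, cleanup placed in end() of a wrapped filter is reached once per consumption, so it would not have to move into reset().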
Re: QueryParser changes query by itself
The queryCache shouldn't be involved; this is somehow an issue in parsing
(and Solr doesn't currently cache parsing).
Perhaps there is something shared in your SynonymQParser instances that
isn't quite thread safe? It could also be something in the text analysis
in Lucene (related to the new graph stuff?).

-Yonik

On Wed, Aug 16, 2017 at 7:32 AM, Bernd Fehling <bernd.fehl...@uni-bielefeld.de> wrote:
> My class SynonymQParser, which calls SolrQueryParserBase.parse:
>
>     class SynonymQParser extends QParser {
>       protected SolrQueryParser sqparser;
>       ...
>       @Override
>       public Query parse() throws SyntaxError {
>         ...
>         sqparser = new SolrQueryParser(this, defaultField);
>         sqparser.setEnableGraphQueries(false);
>         sqparser.setEnablePositionIncrements(false);
>         ...
>         Query synquery = sqparser.parse(qstr);
>         ...
>
> And this is SolrQueryParserBase with its parse method:
>
>     public abstract class SolrQueryParserBase extends QueryBuilder {
>       ...
>       public Query parse(String query) throws SyntaxError {
>         ReInit(new FastCharStream(new StringReader(query)));
>         try {
>           // TopLevelQuery is a Query followed by the end-of-input (EOF)
>           Query res = TopLevelQuery(null);  // pass null so we can tell later if an explicit field was provided or not
>           return res!=null ? res : newBooleanQuery().build();
>         }
>         ...
>
> The String variable "query" going into the parse method is always
> "textth:waffenhandel" !!!
> With a breakpoint at "return", the Query variable "res" sometimes changes to
> a TermQuery with term "textth:rss" instead of being a SynonymQuery.
>
> This is strange!!!
>
> What is ReInit doing right before the try, is that a cache lookup?
>
> Or is the problem in TopLevelQuery?
>
> Regards
> Bernd
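As an illustration of the kind of shared state Yonik is asking about, here is a small sketch in plain Java. All class names (SharedStateAnalyzer, PerRequestAnalyzer, SharedStateSketch) are hypothetical and nothing here is taken from SynonymQParser or Solr. It replays, on a single thread, one interleaving that two concurrent requests could produce when per-request data lives in a field of a shared object. It is not a claim about the actual cause in this thread.

```java
// Hypothetical two-step helper that keeps per-request data in a field.
class SharedStateAnalyzer {
    private String pending;   // shared by every caller of this instance

    void begin(String text) { pending = text; }
    String finish() { return "textth:" + pending; }
}

// Same two steps, but the data lives in an object owned by one request.
class PerRequestAnalyzer {
    static final class Session {
        private final String text;

        Session(String text) { this.text = text; }
        String finish() { return "textth:" + text; }
    }

    Session begin(String text) { return new Session(text); }
}

class SharedStateSketch {
    // Replays an interleaving of two requests against one shared instance.
    static String interleavedShared() {
        SharedStateAnalyzer analyzer = new SharedStateAnalyzer();
        analyzer.begin("waffenhandel");   // request 1 starts
        analyzer.begin("rss");            // request 2 starts before request 1 finishes
        return analyzer.finish();         // request 1 finishes and sees request 2's text
    }

    // The same interleaving with per-request sessions.
    static String interleavedPerRequest() {
        PerRequestAnalyzer analyzer = new PerRequestAnalyzer();
        PerRequestAnalyzer.Session first = analyzer.begin("waffenhandel");
        analyzer.begin("rss");            // request 2 gets its own session
        return first.finish();            // request 1 still sees its own text
    }

    public static void main(String[] args) {
        System.out.println(interleavedShared());
        System.out.println(interleavedPerRequest());
    }
}
```

The design point is only that state tied to one request should be reachable from that request alone; whether such sharing exists in the parser or in the analysis chain has to be checked in the real code.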
Re: QueryParser changes query by itself
My class SynonymQParser, which calls SolrQueryParserBase.parse:

    class SynonymQParser extends QParser {
      protected SolrQueryParser sqparser;
      ...
      @Override
      public Query parse() throws SyntaxError {
        ...
        sqparser = new SolrQueryParser(this, defaultField);
        sqparser.setEnableGraphQueries(false);
        sqparser.setEnablePositionIncrements(false);
        ...
        Query synquery = sqparser.parse(qstr);
        ...

And this is SolrQueryParserBase with its parse method:

    public abstract class SolrQueryParserBase extends QueryBuilder {
      ...
      public Query parse(String query) throws SyntaxError {
        ReInit(new FastCharStream(new StringReader(query)));
        try {
          // TopLevelQuery is a Query followed by the end-of-input (EOF)
          Query res = TopLevelQuery(null);  // pass null so we can tell later if an explicit field was provided or not
          return res!=null ? res : newBooleanQuery().build();
        }
        ...

The String variable "query" going into the parse method is always
"textth:waffenhandel" !!!
With a breakpoint at "return", the Query variable "res" sometimes changes to
a TermQuery with term "textth:rss" instead of being a SynonymQuery.

This is strange!!!

What is ReInit doing right before the try, is that a cache lookup?

Or is the problem in TopLevelQuery?

Regards
Bernd

On 16.08.2017 at 09:06, Bernd Fehling wrote:
> Hi Ahmet,
>
> thank you for your reply. I was also looking towards the QueryCache, but
> with your hint about LUCENE-3758 I have a better point to start with.
>
> If the system is under high load and the QueryCache is filled, I have
> a higher rate of changed queries.
> In debug mode the "timing-->process-->query" of changed queries is always "0".
>
> The query parser "SynonymQParser" is self-developed and uses QParserPlugin.
> There is no caching inside and it has worked for years.
> It was only compiled against recent Lucene/Solr, with some modifications like
> using Builder with newer Lucene versions.
>
> I will test without query cache.
> Which one should be disabled, the Query Result Cache?
>
> Regards
> Bernd
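On the ReInit question: in JavaCC-generated parsers, ReInit conventionally points the parser at a new character stream and clears its internal state, so it is a reset and not a cache lookup. The sketch below mimics only that reset-then-parse shape in plain Java. MiniFieldParser and its methods are hypothetical and far simpler than the generated Solr parser; the point is that a reused parser instance returns results derived from the new input only.

```java
// Hypothetical miniature parser; it only mimics the reset-then-parse shape.
class MiniFieldParser {
    private String input;
    private int pos;

    // Comparable in spirit to ReInit: swap in the new input and clear the
    // position. Nothing from an earlier parse is looked up or reused.
    void reInit(String newInput) {
        input = newInput;
        pos = 0;
    }

    // Parses "field:term" from the current position and returns {field, term}.
    String[] parseFieldTerm() {
        int colon = input.indexOf(':', pos);
        if (colon < 0) {
            throw new IllegalArgumentException("expected field:term, got " + input);
        }
        String field = input.substring(pos, colon);
        String term = input.substring(colon + 1);
        pos = input.length();   // everything has been consumed
        return new String[] { field, term };
    }

    // Reset followed by parse, like the parse(String) method quoted above.
    String[] parse(String query) {
        reInit(query);
        return parseFieldTerm();
    }

    public static void main(String[] args) {
        MiniFieldParser parser = new MiniFieldParser();
        parser.parse("textth:rss");
        String[] result = parser.parse("textth:waffenhandel");
        System.out.println(result[0] + " -> " + result[1]);
    }
}
```

If the real parser behaves like this, a stale term such as "rss" would have to come from somewhere other than the reset step, for example from the analysis chain invoked while building the query.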
Re: QueryParser changes query by itself
Hi Ahmet,

thank you for your reply. I was also looking towards the QueryCache, but
with your hint about LUCENE-3758 I have a better point to start with.

If the system is under high load and the QueryCache is filled, I have
a higher rate of changed queries.
In debug mode the "timing-->process-->query" of changed queries is always "0".

The query parser "SynonymQParser" is self-developed and uses QParserPlugin.
There is no caching inside and it has worked for years.
It was only compiled against recent Lucene/Solr, with some modifications like
using Builder with newer Lucene versions.

I will test without query cache.
Which one should be disabled, the Query Result Cache?

Regards
Bernd

On 15.08.2017 at 19:07, Ahmet Arslan wrote:
> Hi Bernd,
>
> In LUCENE-3758, a new member field was added to the ComplexPhraseQuery class,
> but we didn't change its hashCode method accordingly. This caused anomalies in
> Solr, and Yonik found the bug and fixed hashCode. Your e-mail somehow
> reminded me of this.
> Could it be the QueryCache and the hashCode method/implementation of Query
> subclasses?
> Maybe your good and bad examples are producing the same hashCode, and this is
> confusing the query cache in Solr?
> Can you disable the query cache to test it?
> By the way, which query parser are you using? I believe SynonymQuery is
> produced by BM25 similarity, right?
>
> Ahmet
Re: QueryParser changes query by itself
Hi Bernd,

In LUCENE-3758, a new member field was added to the ComplexPhraseQuery class,
but we didn't change its hashCode method accordingly. This caused anomalies in
Solr, and Yonik found the bug and fixed hashCode. Your e-mail somehow
reminded me of this.
Could it be the QueryCache and the hashCode method/implementation of Query
subclasses?
Maybe your good and bad examples are producing the same hashCode, and this is
confusing the query cache in Solr?
Can you disable the query cache to test it?
By the way, which query parser are you using? I believe SynonymQuery is
produced by BM25 similarity, right?

Ahmet

On Friday, August 11, 2017, 2:48:07 PM GMT+3, Bernd Fehling <bernd.fehl...@uni-bielefeld.de> wrote:
> We just noticed a very strange problem with Solr 6.4.2 QueryParser.
> The QueryParser changes the query by itself from time to time.
> This happens when reloading a search request several times at a higher rate.
>
> Good example:
> ...
> textth:waffenhandel
>
> ...
> textth:waffenhandel
> textth:waffenhandel
> +SynonymQuery(Synonym(textth:"arms sales" textth:"arms trade"...
> +Synonym(textth:"arms sales" textth:"arms trade"...
>
> Bad example:
> ...
> textth:waffenhandel
>
> ...
> textth:waffenhandel
> textth:waffenhandel
> +textth:rss
> +textth:rss
>
> As you can see in the bad example, after several reloads the parsedquery
> changed to the term "rss".
> But the original querystring has no "rss" substring at all. That is really
> strange.
>
> Anyone seen this before?
>
> Single index, Solr 6.4.2.
>
> Regards
> Bernd
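For readers unfamiliar with the failure mode Ahmet describes, here is a minimal sketch in plain Java. SketchQuery and QueryCacheSketch are hypothetical classes, not Lucene or Solr code, and the sketch assumes that both equals() and hashCode() ignore the newly added field (a matching hashCode alone would only cause a hash collision, not a wrong cache hit). Under that assumption, a cache keyed on the query object returns the entry stored for a different query.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical query class, not a Lucene class. The field 'inOrder' was added
// later, but equals() and hashCode() still only look at 'text'.
final class SketchQuery {
    final String text;
    final boolean inOrder;

    SketchQuery(String text, boolean inOrder) {
        this.text = text;
        this.inOrder = inOrder;
    }

    @Override
    public boolean equals(Object o) {
        // Bug for illustration: 'inOrder' is not compared.
        return o instanceof SketchQuery && text.equals(((SketchQuery) o).text);
    }

    @Override
    public int hashCode() {
        // Bug for illustration: 'inOrder' is not hashed.
        return text.hashCode();
    }
}

// Hypothetical result cache keyed on the query object.
class QueryCacheSketch {
    private final Map<SketchQuery, String> cache = new HashMap<>();

    String search(SketchQuery query) {
        // The "result" is computed only on a cache miss.
        return cache.computeIfAbsent(query,
                q -> "results for " + q.text + " inOrder=" + q.inOrder);
    }

    public static void main(String[] args) {
        QueryCacheSketch searcher = new QueryCacheSketch();
        System.out.println(searcher.search(new SketchQuery("arms trade", true)));
        // Different query, but it hits the entry cached for the first one.
        System.out.println(searcher.search(new SketchQuery("arms trade", false)));
    }
}
```

Disabling the cache, as Ahmet suggests, is the quick way to tell this kind of problem apart from a parsing problem: with the cache off, every request is computed from its own query object.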
QueryParser changes query by itself
We just noticed a very strange problem with Solr 6.4.2 QueryParser.
The QueryParser changes the query by itself from time to time.
This happens when reloading a search request several times at a higher rate.

Good example:
...
textth:waffenhandel

...
textth:waffenhandel
textth:waffenhandel
+SynonymQuery(Synonym(textth:"arms sales" textth:"arms trade"...
+Synonym(textth:"arms sales" textth:"arms trade"...

Bad example:
...
textth:waffenhandel

...
textth:waffenhandel
textth:waffenhandel
+textth:rss
+textth:rss

As you can see in the bad example, after several reloads the parsedquery
changed to the term "rss".
But the original querystring has no "rss" substring at all. That is really
strange.

Anyone seen this before?

Single index, Solr 6.4.2.

Regards
Bernd