FWIW: we have also seen serious Query of Death issues after our upgrade to Solr 7.6. Are there any open issues we can watch? Are Markus' findings around `pf` our best guess? We've seen these issues even with ps=0. We also use the WDF.
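
For anyone wanting to compare the variants discussed in this thread (no `pf`, `pf` with ps=0, `pf` with slop), here is a minimal Python sketch that only builds the request URLs. The host, the collection name `mycollection` and the field names `title` and `content` are placeholders and not taken from this thread; `defType`, `qf`, `pf`, `ps`, `rows` and `debugQuery` are standard Solr request parameters. Please try this on a test node only, since the query below is the one reported to take 7.6 nodes down.

```python
from urllib.parse import urlencode

# Placeholder endpoint: adjust host and collection to your own test setup.
SOLR_SELECT = "http://localhost:8983/solr/mycollection/select"

# The problematic query quoted further down in this thread.
QUERY = "dubbele dijk/rijke dijkproject in het dijktracé eemshaven-delfzijl"


def build_url(pf=None, ps=None):
    """Build an edismax request URL, optionally with phrase fields and slop."""
    params = {
        "defType": "edismax",
        "q": QUERY,
        "qf": "title content",  # placeholder field names
        "rows": "0",            # we only care about the parsed query
        "debugQuery": "true",   # ask Solr to return the parsed query
    }
    if pf is not None:
        params["pf"] = pf
    if ps is not None:
        params["ps"] = str(ps)
    return SOLR_SELECT + "?" + urlencode(params)


variants = {
    "no_pf": build_url(),
    "pf_ps0": build_url(pf="title content", ps=0),
    "pf_ps2": build_url(pf="title content", ps=2),
}

for name, url in variants.items():
    print(name, url)
```

Comparing the size of the parsed query in the debug output across the three variants should show whether `pf`, `ps`, or both make the difference on a given setup.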

On Fri, Feb 22, 2019 at 8:58 AM Markus Jelsma <markus.jel...@openindex.io> wrote:

> Hello Michael,
>
> Sorry it took so long to get back to this, too many things to do.
>
> Anyway, yes, we have WDF on our query-time analysers. I uploaded two log
> files, both for the same query of death, with and without the synonym
> filter enabled.
>
> https://mail.openindex.io/export/solr-8983-console.log 23 MB
> https://mail.openindex.io/export/solr-8983-console-without-syns.log 1.9 MB
>
> Without the synonym filter we still see a huge number of entries. Many
> different parts of our analyser chain contribute to the expansion of
> queries, but pf itself really turns the problem on or off.
>
> Since SOLR-12243 is new in 7.6, does anyone know whether SOLR-12243 could
> have this side-effect?
>
> Thanks,
> Markus
>
> -----Original message-----
> > From: Michael Gibney <mich...@michaelgibney.net>
> > Sent: Friday 8th February 2019 17:19
> > To: solr-user@lucene.apache.org
> > Subject: Re: Query of Death Lucene/Solr 7.6
> >
> > Hi Markus,
> > As of 7.6, LUCENE-8531 <https://issues.apache.org/jira/browse/LUCENE-8531>
> > reverted a graph/Spans-based phrase query implementation (introduced in
> > 6.5 -- LUCENE-7699 <https://issues.apache.org/jira/browse/LUCENE-7699>) to
> > an implementation that builds a separate phrase query for each possible
> > enumerated path through the graph described by a parsed query.
> > The potential for combinatoric explosion of the enumerated approach was
> > (as far as I can tell) one of the main motivations for introducing the
> > Spans-based implementation. Some real-world use cases would be good to
> > explore. Markus, could you send (as an attachment) the debug toString()
> > for the queries with/without synonyms enabled? I'm also guessing you may
> > have WordDelimiterGraphFilter on the query analyzer?
> > As an alternative to disabling pf, LUCENE-8531 only reverts to the
> > enumerated approach for phrase queries where slop>0, so setting ps=0
> > would probably also help.
> > Michael
> >
> > On Fri, Feb 8, 2019 at 5:57 AM Markus Jelsma <markus.jel...@openindex.io>
> > wrote:
> >
> > > Hello (apologies for cross-posting),
> > >
> > > While working on SOLR-12743, using 7.6 on two nodes and 7.2.1 on the
> > > remaining four, we stumbled upon a situation where the 7.6 nodes quickly
> > > succumb when a 'Query-of-Death' is issued; 7.2.1 up to 7.5 are all
> > > unaffected (tested and confirmed).
> > >
> > > Following Smiley's suggestion I used Eclipse MAT to find the problem in
> > > the heap dump I obtained. This fantastic tool revealed within minutes
> > > that a query thread ate 65% of all resources; in the class variables I
> > > could find the query and reproduce the problem.
> > >
> > > The problematic query is 'dubbele dijk/rijke dijkproject in het dijktracé
> > > eemshaven-delfzijl'. On 7.6 this input produces a 40+ MB toString() output
> > > in edismax' newFieldQuery. If the node survives, it takes 2+ seconds for
> > > the query to run (150 ms otherwise). If I disable all query-time
> > > SynonymGraphFilters it still takes a second and produces just a 9 MB
> > > toString() for the query.
> > >
> > > I could not find anything like this in Jira. I did think of LUCENE-8479
> > > and LUCENE-8531 but they were about graphs; this problem looked related
> > > though.
> > >
> > > I think I tracked it further down to LUCENE-8589 or SOLR-12243. When I
> > > leave Solr's edismax' pf parameter empty, everything runs fast. When all
> > > fields are configured for pf, the node dies.
> > >
> > > I am now unsure whether this is a Solr or a Lucene issue.
> > >
> > > Please let me know.
> > >
> > > Many thanks,
> > > Markus
> > >
> > > ps. in Solr I even got an 'Impossible Exception', my first!
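
To make the "combinatoric explosion of the enumerated approach" that Michael mentions a bit more concrete, here is a minimal Python sketch. It is not Lucene code and does not model a real token graph: it assumes a simplified lattice in which every position independently offers a few alternative terms, and the alternatives listed are made-up placeholders, not output from anyone's analyser chain. Under that assumption, the number of enumerated paths (and so of separate phrase queries) is the product of the alternatives per position.

```python
from itertools import product
from math import prod

# Made-up alternatives per token position (placeholders for whatever synonym
# or word-delimiter expansion might emit). Simplification: a real token graph
# can have alternatives spanning several positions; a plain lattice cannot.
positions = [
    ["dubbele", "alt_1a"],
    ["dijk", "alt_2a", "alt_2b"],
    ["rijke", "alt_3a"],
    ["dijkproject", "alt_4a", "alt_4b"],
]


def enumerate_paths(positions):
    """Return every path through the lattice, picking one term per position."""
    return [" ".join(path) for path in product(*positions)]


def count_paths(positions):
    """Count the paths without materialising them."""
    return prod(len(alternatives) for alternatives in positions)


paths = enumerate_paths(positions)
print(len(paths))                  # 2 * 3 * 2 * 3 = 36 phrases for 4 positions

# Tripling the query length to 12 positions cubes the count.
print(count_paths(positions * 3))  # 36 ** 3 = 46656
```

If each field listed in pf gets its own set of phrase queries (an assumption here, not something verified in this thread), the total scales again by the number of pf fields, which would fit with the observation that leaving pf empty makes the problem go away.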