Re: Multiple fq vs combined fq performance
All non-cached filters are executed together (leapfrogging between them) and are sorted by filter cost. I guess that, since you aren't setting a cost, the order of the input matters. You can try setting a cost on your filters (lower than 100, so that they don't become post filters).

One other thing, though: I guess you are using Point fields? If you typically query for a single value, as in this example (vs. ranges), you may want to use string fields for those. See https://issues.apache.org/jira/browse/SOLR-11078.

In reply to Chris Dempsey's message of Fri, Jul 10, 2020 at 7:51 AM.
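A minimal sketch of the cost suggestion above, in Python, using only the standard library to build the request parameters. The helper name `fq_with_cost`, the specific cost values (10, 20, 30), and `q=*:*` are assumptions for illustration and do not come from the thread. Whether reordering by cost actually changes QTime depends on the Solr version and field types, so treat this as something to experiment with rather than a guaranteed fix.

```python
from urllib.parse import urlencode


def fq_with_cost(query: str, cost: int) -> str:
    """Wrap a filter query in local params that disable caching and set a cost."""
    if not 0 <= cost < 100:
        # Staying below 100 keeps the filter from being treated as a post filter.
        raise ValueError("cost must be in [0, 100) for a regular filter")
    return f"{{!cache=false cost={cost}}}{query}"


# Hypothetical ordering: the most selective filter gets the lowest cost.
filters = [
    fq_with_cost("taggedTickets_ticketId:100241", 10),
    fq_with_cost("_class:taggedTickets", 20),
    fq_with_cost("companyId:22476", 30),
]

# doseq=True emits one fq=... parameter per list element.
# q=*:* is an assumption; the original post does not show the main query.
params = urlencode({"q": "*:*", "fq": filters}, doseq=True)
print(params)
```

The resulting string can be appended to a collection's `/select` URL. Each `fq` carries its own local params, so the costs can be tuned independently.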
Re: Multiple fq vs combined fq performance
Thanks for the suggestion, Alex. It doesn't appear that IndexOrDocValuesQuery (at least in Solr 7.7.1) supports the PostFilter interface. I've tried various values for cost on each of the fq and it doesn't change the QTime.

So, after digging around a bit: even though `{!cache=false}taggedTickets_ticketId:100241` matches exactly one document in the collection, that doesn't matter for the other two fq, which continue to look over the index of the collection, correct?

In reply to Alexandre Rafalovitch's message of Thu, Jul 9, 2020 at 4:24 PM.
Re: Multiple fq vs combined fq performance
I _think_ it will run all 3 and then do index hopping. But if you know one fq is super expensive, you could assign it a cost. A value of 100 or more will try to use PostFilter and apply the query on top of the results from the other queries.

https://lucene.apache.org/solr/guide/8_4/common-query-parameters.html#cache-parameter

Hope it helps,
Alex.

In reply to Chris Dempsey's message of Thu., Jul. 9, 2020, 2:05 p.m.
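To make "index hopping" (leapfrogging) concrete, here is a simplified Python illustration of intersecting sorted doc-ID lists. It is a teaching sketch, not Lucene's or Solr's actual code: the function names and the toy doc-ID lists are assumptions. It also does not model the cost of building each iterator, which the thread suggests may matter for Point fields.

```python
from bisect import bisect_left
from typing import List, Sequence


def advance(postings: Sequence[int], start: int, target: int) -> int:
    """Return the index of the first doc ID >= target, searching from start."""
    return bisect_left(postings, target, start)


def leapfrog_intersect(lists: List[Sequence[int]]) -> List[int]:
    """Intersect sorted doc-ID lists by letting the iterators leapfrog each other."""
    if not lists or any(len(p) == 0 for p in lists):
        return []
    # Lead with the shortest list so the most selective filter drives the loop.
    lists = sorted(lists, key=len)
    positions = [0] * len(lists)
    result = []
    candidate = lists[0][0]
    while True:
        agreed = True
        for i, postings in enumerate(lists):
            positions[i] = advance(postings, positions[i], candidate)
            if positions[i] == len(postings):
                return result  # one list is exhausted: no further matches
            doc = postings[positions[i]]
            if doc != candidate:
                candidate = doc  # jump ahead and re-check every list
                agreed = False
                break
        if agreed:
            result.append(candidate)
            candidate += 1  # continue searching after this match


# Toy data: one very selective filter and two broad ones.
ticket_ids = [7]
company_docs = [1, 3, 5, 7, 9, 11]
class_docs = list(range(20))
print(leapfrog_intersect([ticket_ids, company_docs, class_docs]))
```

In this toy model the single-document list ends the intersection almost immediately, which matches the intuition in the original question. The slow case reported in the thread therefore presumably comes from work this sketch leaves out.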
Multiple fq vs combined fq performance
Hi all!

In a collection where we have ~54 million documents, we've noticed that a query with the following:

    "fq":["{!cache=false}_class:taggedTickets",
          "{!cache=false}taggedTickets_ticketId:100241",
          "{!cache=false}companyId:22476"]

for which debugQuery shows:

    "parsed_filter_queries":[
      "{!cache=false}_class:taggedTickets",
      "{!cache=false}IndexOrDocValuesQuery(taggedTickets_ticketId:[100241 TO 100241])",
      "{!cache=false}IndexOrDocValuesQuery(companyId:[22476 TO 22476])"
    ]

runs in roughly 450ms, but if we remove `{!cache=false}companyId:22476` it drops down to ~5ms (it's important to note that `taggedTickets_ticketId` is globally unique).

If we change the fqs to:

    "fq":["{!cache=false}_class:taggedTickets",
          "{!cache=false}+companyId:22476 +taggedTickets_ticketId:100241"]

debugQuery shows:

    "parsed_filter_queries":[
      "{!cache=false}_class:taggedTickets",
      "{!cache=false}+IndexOrDocValuesQuery(companyId:[22476 TO 22476]) +IndexOrDocValuesQuery(taggedTickets_ticketId:[100241 TO 100241])"
    ]

and we get the correct result back in ~5ms.

My current thought is that in the slow scenario Solr is still running `{!cache=false}IndexOrDocValuesQuery(companyId:[22476 TO 22476])` even though it "has the answer" from the first two fq.

Am I off-base or misunderstanding how `fq` are processed?
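For anyone wanting to compare the two variants side by side, the following Python sketch builds both query strings with `debugQuery` enabled. The `fq` values are taken from the message above. `q=*:*`, `rows=0`, and the helper name `build_query_string` are assumptions, since the original post does not show the rest of the request.

```python
from urllib.parse import urlencode

NO_CACHE = "{!cache=false}"

# Slow variant from the post: three separate filter queries.
separate = [
    NO_CACHE + "_class:taggedTickets",
    NO_CACHE + "taggedTickets_ticketId:100241",
    NO_CACHE + "companyId:22476",
]

# Fast variant from the post: the two ID filters combined into one fq.
combined = [
    NO_CACHE + "_class:taggedTickets",
    NO_CACHE + "+companyId:22476 +taggedTickets_ticketId:100241",
]


def build_query_string(fqs):
    """Encode a request with one fq parameter per filter."""
    # q and rows are assumed values, not taken from the original post.
    request = {"q": "*:*", "rows": "0", "debugQuery": "true", "fq": fqs}
    return urlencode(request, doseq=True)


for label, fqs in (("separate", separate), ("combined", combined)):
    print(label, build_query_string(fqs))
```

Sending each string to the same collection and comparing `QTime` and `parsed_filter_queries` in the responses should reproduce the comparison described in the message.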