Hello Tim,

Since Solr 5.3 (so this needs an upgrade from 4.10) you can try adding the score=none local parameter, i.e. {!join ... score=none}. It switches the algorithm to Lucene's JoinUtil, whose complexity depends on the number of docs on the "from" side.
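To illustrate where the parameter goes, here is a minimal sketch in Python that only assembles the query string. The helper name build_join_query and the field names (job_application_id_i, id_i, state_s) are hypothetical placeholders, not taken from your schema; substitute the fields your Sunspot setup actually generates.

```python
def build_join_query(from_field, to_field, inner_query, score=None):
    """Assemble a Solr {!join} query string (hypothetical helper).

    from_field / to_field are the join keys; inner_query is the query run
    on the "from" side. When score is given (e.g. "none"), it is added as
    a local parameter.
    """
    params = [f"from={from_field}", f"to={to_field}"]
    if score is not None:
        params.append(f"score={score}")
    return "{!join " + " ".join(params) + "}" + inner_query


# Field names below are assumptions for illustration only.
plain_join = build_join_query("job_application_id_i", "id_i", "state_s:scheduled")
scored_join = build_join_query(
    "job_application_id_i", "id_i", "state_s:scheduled", score="none"
)

print(plain_join)
print(scored_join)
```

The resulting string can be sent as the q or fq parameter of the request; the only difference from the original query is the extra score=none inside the local parameters.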

On Sun, Aug 14, 2016 at 10:13 PM, Tim Frey <tfr...@gmail.com> wrote:

> Hi there. I'm trying to fix a performance problem I have with queries that
> use Solr's Join feature. The query is intended to find all Job
> Applications that have an Interview in a particular state. There are 20
> million Job Applications and around 7 million Interviews, with 1 million
> Interviews in the state I'm looking for. With all other filters applied,
> the total result set is around 5000 documents. The query takes around 10
> seconds.
>
> After reading up on how Joins are essentially just subqueries, I understand
> why my original approach would be slow. However, when I add another
> restriction for the "inner query" to a single Job Application the entire
> query still takes around 5 seconds. In this case, the inner query matches
> 2 documents and the total result set size is 1 document (as expected.)
>
> Here's the debug output:
> https://gist.github.com/tfrey7/50cd92c98e767ec612cc98bf430b9931
>
> I'm using Solr 4.10. All documents are in the same index. The ID columns
> are dynamic integer fields (because we're using the Sunspot ruby library,
> exactly like:
> https://github.com/sunspot/sunspot/blob/master/sunspot_solr/solr/solr/configsets/sunspot/conf/schema.xml#L179
> )
>
> Is there something obviously wrong with the query that I'm making? Can
> query-time Joins ever work for a scenario like this?
>
> Thanks!
>

--
Sincerely yours
Mikhail Khludnev