On Mon, 2019-04-08 at 09:58 +1000, Ash Ramesh wrote:
> We have a corpus of 50+ million documents in our collection. I've
> noticed that some queries with specific keywords tend to be extremely
> slow.
> E.g. the q=`photography' or q='background'. After digging into the
> raw documents, I could see that these two terms appear in greater
> than 90% of all documents, which means solr has to score each of
> those documents.

That is known behaviour, which can be remedied somewhat.

Stop words are a common approach, but your examples do not seem to fit well with that. Instead you can look at Common Grams, where your high-frequency words get concatenated with surrounding words. This only helps with phrase queries, though. There's a nice article at
https://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-2

- Toke Eskildsen, Royal Danish Library
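
To illustrate the Common Grams idea mentioned above, here is a minimal Python sketch of what index-time token output could look like. It is not Solr's implementation: the function name `common_grams`, the `_` separator and the sample word list are assumptions chosen for illustration only.

```python
def common_grams(tokens, common_words):
    """Return the tokens plus a bigram wherever a common word is involved.

    Every original token is kept. In addition, each adjacent pair in
    which at least one token is a common word is emitted as a single
    "left_right" token, so a phrase lookup can hit the much rarer
    bigram instead of the very frequent single word.
    """
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            if tok in common_words or nxt in common_words:
                out.append(f"{tok}_{nxt}")
    return out


# Hypothetical word list based on the two terms from the question.
COMMON = {"photography", "background"}

grams = common_grams(["stock", "photography", "background"], COMMON)
print(grams)
```

A phrase query such as "stock photography" can then be matched against the single term `stock_photography`, which occurs in far fewer documents than `photography` alone. A single-term query like q=photography gains nothing from this, which is why the approach is limited to phrases. In Solr the corresponding analysis filters are `solr.CommonGramsFilterFactory` (index side) and `solr.CommonGramsQueryFilterFactory` (query side).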