One thing that would be interesting is to analyze the QTimes for individual queries in the logs from these runs. If you ship me the log files, I can take a look. I'll also be posting a branch tomorrow with a new command-line tool for posting logs to be indexed in Solr, and you can take a look at that.
And the profiler is probably the only way to know for sure what's happening here.

Joel Bernstein
http://joelsolr.blogspot.com/

On Wed, Dec 18, 2019 at 7:37 PM Erick Erickson <erickerick...@gmail.com> wrote:

> The very short form is that from Solr 6.6.1 to Solr 8.3.1, the throughput
> for date boosting in my tests dropped by 40+%.
>
> I’ve been hearing about slowdowns in successive Solr releases with boost
> functions, so I dug into it a bit. The test setup is just a boost-by-date
> with an additional big OR clause of 100 random words so I’d be sure to hit
> a bunch of docs. I figured that if there were few hits, the signal would be
> lost in the noise, but I didn’t look at the actual hit counts.
>
> I saw several Solr JIRAs about this subject, but they were slightly
> different, although quite possibly the same underlying issue. So I tried to
> narrow this down to a very specific form of query.
>
> I’ve also seen some cases in the wild where the response time was
> proportional to the number of segments, hence my optimize experiments.
>
> Here are the results, explanation below. "O" stands for optimized to one
> segment. I spot-checked pdate against 7x and 8x and it wasn’t
> significantly different performance-wise from tdate. All fields have
> docValues enabled. I ran these against a multiValued="false" field. All
> the tests pegged all my CPUs. JMeter was run on a different machine than
> Solr, and only one Solr instance was running for any test.
>
> Solr version   queries/min
> 6.6.1          3,400
> 6.6.1 O        4,800
>
> 7.1            2,800
> 7.1 O          4,200
>
> 7.7.1          2,400
> 7.7.1 O        3,500
>
> 8.3.1          2,000
> 8.3.1 O        2,600
>
> The tests I’ve been running just index 20M docs into a single core, then
> run the exact same 10,000 queries against them from JMeter with 24
> threads. Spot checks showed no hits on the queryResultCache.
> A query looks like this:
>
> rows=0&{!boost b=recip(ms(NOW,
> INSERT_FIELD_HERE),3.16e-11,1,1)}text_txt:(campaigners OR adjourned OR
> anyplace…97 more random words)
>
> There is no faceting, no grouping, and no sorting.
>
> I fill in INSERT_FIELD_HERE through JMeter magic. I’m running the exact
> same queries for every test.
>
> One wildcard is that I did regenerate the index for each major revision,
> and chose random words from the same list, as well as random times
> (bounded in the same range, though), so the docs are not completely
> identical. The index was in the native format for that major version,
> even if slightly different between versions. I ran the test once, then
> ran it again after optimizing the index.
>
> I haven’t dug any further. If anyone’s interested I can throw a profiler
> at, say, 8.3 and see what I can see, although I’m not going to have time
> to dive into this any time soon. I’d be glad to run some tests, though. I
> saved the queries and the indexes, so running a test would only take a
> few minutes.
>
> While I concentrated on date fields, the docs have date, int, and long
> fields, both docValues=true and docValues=false, each variant with
> multiValued=true and multiValued=false, and both Trie and Point (where
> possible) variants, as well as a pretty simple text field.
>
> Erick
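For readers unfamiliar with the boost being benchmarked: Solr's function query recip(x, m, a, b) evaluates to a / (m*x + b), so recip(ms(NOW, field), 3.16e-11, 1, 1) decays smoothly with document age, and 3.16e-11 is roughly the reciprocal of the milliseconds in a year. A minimal sketch of the arithmetic (not Solr code, just the formula):

```python
def recip(x, m, a, b):
    """Solr's recip function query: a / (m*x + b)."""
    return a / (m * x + b)

# ~3.15e10 ms in a year; the constant 3.16e-11 is roughly its reciprocal,
# so a one-year-old doc scores about half of a brand-new one.
MS_PER_YEAR = 365 * 24 * 3600 * 1000

# Boost for documents of various ages, where age_ms plays the role
# of ms(NOW, date_field):
for years in (0, 1, 2, 5):
    age_ms = years * MS_PER_YEAR
    print(years, round(recip(age_ms, 3.16e-11, 1, 1), 3))
```

A fresh document gets a boost of 1.0, a one-year-old document roughly 0.5, which is why the big OR clause matters: with enough matching docs, this per-document function evaluation dominates the query cost.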
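For anyone wanting to reproduce a query of this shape outside JMeter, here is a hedged sketch of assembling the request parameters with the standard library. The field name "timestamp_tdt" and the short word list are placeholders, not taken from the actual tests:

```python
from urllib.parse import urlencode

# Placeholder stand-ins for the real test inputs (the actual tests used
# 100 random words and filled in the field name via JMeter).
words = ["campaigners", "adjourned", "anyplace"]

# Double braces escape the literal { } of Solr's local-params syntax.
q = "{{!boost b=recip(ms(NOW,{field}),3.16e-11,1,1)}}text_txt:({clause})".format(
    field="timestamp_tdt",
    clause=" OR ".join(words),
)
params = urlencode({"q": q, "rows": 0})
print(params)
```

The resulting string can be appended to a /select request; rows=0 matches the tests above, which measure scoring cost without fetching documents.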