One thing that would be interesting is to analyze the QTimes for
individual queries from the logs for these runs. If you ship me the log
files, I can take a look. Tomorrow I'll also be posting a branch with a
new command-line tool for posting logs to Solr for indexing, and you can
take a look at that as well.
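
In the meantime, a minimal sketch of pulling QTimes out of a request log
is below. It assumes the default Solr log format, where each request line
carries a QTime=<ms> token; the solr.log path is a placeholder.

import re
import statistics

# Collect per-query QTime values from a Solr request log. Assumes the
# default log format, where request lines end "... status=0 QTime=12".
qtimes = []
with open("solr.log") as log:                # placeholder path
    for line in log:
        m = re.search(r"QTime=(\d+)", line)
        if m:
            qtimes.append(int(m.group(1)))

if qtimes:
    qtimes.sort()
    print("queries:", len(qtimes))
    print("median :", statistics.median(qtimes))
    print("p95    :", qtimes[int(len(qtimes) * 0.95)])
    print("max    :", qtimes[-1])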

And the profiler is probably the only way to know for sure what's happening
here.





Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, Dec 18, 2019 at 7:37 PM Erick Erickson <erickerick...@gmail.com>
wrote:

> The very short form is that from Solr 6.6.1 to Solr 8.3.1, the throughput
> for date boosting in my tests dropped by more than 40%.
>
> I’ve been hearing about slowdowns in successive Solr releases with boost
> functions, so I dug into it a bit. The test setup is just a boost-by-date
> with an additional big OR clause of 100 random words so I’d be sure to hit
> a bunch of docs. I figured that if there were few hits, the signal would be
> lost in the noise, but I didn’t look at the actual hit counts.
>
> I saw several Solr JIRAs about this subject, but they were slightly
> different, although quite possibly the same underlying issue. So I tried to
> get this down to a very specific form of a query.
>
> I’ve also seen some cases in the wild where the response time was
> proportional to the number of segments, hence my optimize experiments.
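>
> Those optimize experiments just force-merge the index down to one
> segment; a minimal sketch of that step is below, where the host and core
> name are placeholders for whatever your setup uses.
>
> import urllib.request
>
> # Force-merge the index down to a single segment before re-running the
> # query set (this is what the "O" rows in the results refer to).
> url = ("http://localhost:8983/solr/test_core/update"
>        "?optimize=true&maxSegments=1")
> with urllib.request.urlopen(url) as resp:
>     print(resp.status)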
>
> Here are the results; explanation below. O stands for optimized to one
> segment. I spot-checked pdate against 7x and 8x and it wasn’t
> significantly different performance-wise from tdate. All fields have
> docValues enabled. I ran these against a multiValued=“false” field. All
> the tests pegged all my CPUs. JMeter was run on a different machine than
> Solr, and only one Solr instance was running for any test.
>
> Solr version    queries/min
> 6.6.1           3,400
> 6.6.1 O         4,800
>
> 7.1             2,800
> 7.1 O           4,200
>
> 7.7.1           2,400
> 7.7.1 O         3,500
>
> 8.3.1           2,000
> 8.3.1 O         2,600
>
>
> The tests I’ve been running just index 20M docs into a single core, then
> run the exact same 10,000 queries against them from JMeter with 24
> threads. Spot checks showed no hits on the queryResultCache.
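>
> For anyone wanting to approximate the harness without JMeter, a rough
> Python sketch of the 24-thread loop is below. The host, core name, and
> queries.txt file are placeholders, and it only reports aggregate
> throughput, not per-query latency.
>
> import time
> import urllib.request
> from concurrent.futures import ThreadPoolExecutor
>
> BASE = "http://localhost:8983/solr/test_core/select?"  # placeholder
>
> def run(query):
>     # Fire one query and drain the response; rows=0 keeps payloads tiny.
>     with urllib.request.urlopen(BASE + query) as resp:
>         resp.read()
>
> # One URL-encoded query string per line, the same 10,000 every run.
> with open("queries.txt") as f:
>     queries = [line.strip() for line in f]
>
> start = time.time()
> with ThreadPoolExecutor(max_workers=24) as pool:
>     list(pool.map(run, queries))
>
> print("queries/min:", int(len(queries) * 60 / (time.time() - start)))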
>
> A query looks like this:
> rows=0&{!boost b=recip(ms(NOW,
> INSERT_FIELD_HERE),3.16e-11,1,1)}text_txt:(campaigners OR adjourned OR
> anyplace…97 more random words)
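>
> The 3.16e-11 constant is roughly the reciprocal of one year in
> milliseconds; since recip(x, m, a, b) evaluates to a / (m*x + b), a doc
> dated one year before NOW gets about half the boost of a brand-new one.
> A quick check:
>
> # recip(ms(NOW, field), 3.16e-11, 1, 1) = 1 / (3.16e-11 * age_ms + 1)
> age_ms = 365 * 24 * 3600 * 1000          # doc dated one year before NOW
> print(1.0 / (3.16e-11 * age_ms + 1.0))   # ~0.5: half a new doc's weight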
>
> There is no faceting. No grouping. No sorting.
>
> I fill in INSERT_FIELD_HERE through JMeter magic. I’m running the exact
> same queries for every test.
>
> One wildcard is that I regenerated the index for each major revision,
> choosing random words from the same word list and random times (bounded
> in the same range), so the docs are not completely identical. The index
> was in the native format for each major version, even if slightly
> different between versions. I ran the test once, then ran it again after
> optimizing the index.
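>
> For concreteness, a sketch of the kind of doc generation described above
> is below; the word list, field names, and core are all placeholders, not
> the exact setup I used.
>
> import json
> import random
> import time
> import urllib.request
>
> WORDS = open("words.txt").read().split()   # placeholder word list
> DAY_MS = 24 * 3600 * 1000
>
> def make_doc(i):
>     # Random words from a fixed list plus a random date in a bounded
>     # range, so runs are comparable but docs aren't identical.
>     ts = int(time.time() * 1000) - random.randint(0, 3650) * DAY_MS
>     date = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(ts / 1000))
>     return {"id": str(i),
>             "text_txt": " ".join(random.choices(WORDS, k=200)),
>             "date_1_pdt": date}             # hypothetical field name
>
> docs = [make_doc(i) for i in range(1000)]   # index in batches
> req = urllib.request.Request(
>     "http://localhost:8983/solr/test_core/update?commit=true",
>     data=json.dumps(docs).encode("utf-8"),
>     headers={"Content-Type": "application/json"},
> )
> urllib.request.urlopen(req).read()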
>
> I haven’t dug any further. If anyone’s interested, I can throw a profiler
> at, say, 8.3 and see what I can see, although I’m not going to have time
> to dive into this any time soon. I’d be glad to run some tests, though. I
> saved the queries and the indexes, so running a test would only take a
> few minutes.
>
> While I concentrated on date fields, the docs have date, int, and long
> fields, both docValues=true and docValues=false, each variant with
> multiValued=true and multiValued=false, and both Trie and Point variants
> (where possible), as well as a pretty simple text field.
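>
> If it helps to reproduce the field layout, the variants can be created
> through the Schema API; a minimal sketch for one of them is below (the
> field and core names are made up).
>
> import json
> import urllib.request
>
> # Add one docValues=true, single-valued Point date field; the other
> # variants just toggle type/docValues/multiValued the same way.
> payload = json.dumps({"add-field": {
>     "name": "date_1_pdt",                   # hypothetical field name
>     "type": "pdate",
>     "docValues": True,
>     "multiValued": False,
> }}).encode("utf-8")
>
> req = urllib.request.Request(
>     "http://localhost:8983/solr/test_core/schema",  # placeholder core
>     data=payload,
>     headers={"Content-Type": "application/json"},
> )
> with urllib.request.urlopen(req) as resp:
>     print(resp.status)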
>
> Erick
>
>
>
>
