Re: slow solr facet processing

2018-01-05 Thread Erick Erickson
Ere: This is an excellent summary. It conforms to what I think I know, and it's always nice to see confirmation! I'd add two small enhancements. Your point 5 mentions sorting. The same consideration is true for grouping and faceting as well. What all three have in common is that they answer the

Re: slow solr facet processing

2018-01-05 Thread Ere Maijala
Hi Everyone, This is a follow-up to the discussion from September 2017. Since then I've spent a lot of time building a better understanding of docValues compared to UIF, and of other things related to Solr performance. Here's a summary of the results based on my real-world experience: 1. Making

Re: slow solr facet processing

2017-09-05 Thread Ere Maijala
Toke Eskildsen wrote on 5.9.2017 at 13:49: On Mon, 2017-09-04 at 11:03 -0400, Yonik Seeley wrote: It's due to this (see comments in UnInvertedField): I have read that. What I don't understand is the difference between 4.x and 6.x. But as you say, Ere seems to be in the process of verifying

Re: slow solr facet processing

2017-09-05 Thread Ere Maijala
Yonik Seeley wrote on 4.9.2017 at 18:03: It's due to this (see comments in UnInvertedField): * To further save memory, the terms (the actual string values) are not all stored in * memory, but a TermIndex is used to convert term numbers to term values only * for the terms needed after

Re: slow solr facet processing

2017-09-05 Thread Yonik Seeley
The number-of-segments noise probably swamps this... but one optimization around deep-facet-paging that didn't get carried forward is https://issues.apache.org/jira/browse/SOLR-2092 -Yonik On Tue, Sep 5, 2017 at 6:49 AM, Toke Eskildsen wrote: > On Mon, 2017-09-04 at 11:03 -0400,

Re: slow solr facet processing

2017-09-05 Thread Toke Eskildsen
On Mon, 2017-09-04 at 11:03 -0400, Yonik Seeley wrote: > It's due to this (see comments in UnInvertedField): I have read that. What I don't understand is the difference between 4.x and 6.x. But as you say, Ere seems to be in the process of verifying whether this is simply due to more segments in

Re: slow solr facet processing

2017-09-04 Thread Yonik Seeley
On Mon, Sep 4, 2017 at 6:38 AM, Toke Eskildsen wrote: > On Mon, 2017-09-04 at 13:21 +0300, Ere Maijala wrote: >> Thanks for the insight, Yonik. I can confirm that #2 is true. I ran >> >> >> >> and after it completed I was able to retrieve 2000 values in 17ms. > > Very interesting. Is

Re: slow solr facet processing

2017-09-04 Thread Ere Maijala
Toke Eskildsen wrote on 4.9.2017 at 13:38: On Mon, 2017-09-04 at 13:21 +0300, Ere Maijala wrote: Thanks for the insight, Yonik. I can confirm that #2 is true. I ran and after it completed I was able to retrieve 2000 values in 17ms. Very interesting. Is this on spinning disks or SSD? Is

Re: slow solr facet processing

2017-09-04 Thread Toke Eskildsen
On Mon, 2017-09-04 at 13:21 +0300, Ere Maijala wrote: > Thanks for the insight, Yonik. I can confirm that #2 is true. I ran > > > > and after it completed I was able to retrieve 2000 values in 17ms. Very interesting. Is this on spinning disks or SSD? Is your index data cached in memory? What I

Re: slow solr facet processing

2017-09-04 Thread Ere Maijala
Yonik Seeley wrote on 1.9.2017 at 17:03: > On Fri, Sep 1, 2017 at 9:17 AM, Ere Maijala wrote: >> I spoke a bit too soon. Now I see why I didn't see any improvement from >> facet.method=uif before: its performance seems to depend heavily on how many >> facets are

Re: slow solr facet processing

2017-09-01 Thread Yonik Seeley
On Fri, Sep 1, 2017 at 9:17 AM, Ere Maijala wrote: > I spoke a bit too soon. Now I see why I didn't see any improvement from > facet.method=uif before: its performance seems to depend heavily on how many > facets are returned. With an index of 6 million records and the

Re: slow solr facet processing

2017-09-01 Thread Yonik Seeley
On Fri, Sep 1, 2017 at 9:17 AM, Ere Maijala wrote: > I spoke a bit too soon. Now I see why I didn't see any improvement from > facet.method=uif before: its performance seems to depend heavily on how many > facets are returned. With an index of 6 million records and the

Re: slow solr facet processing

2017-09-01 Thread Ere Maijala
I spoke a bit too soon. Now I see why I didn't see any improvement from facet.method=uif before: its performance seems to depend heavily on how many facets are returned. With an index of 6 million records and the facet having 1960 buckets: facet.limit=20 takes 4ms; facet.limit=200 takes ~100ms
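
A measurement like the one in this message could be reproduced with a small script. The following is only a sketch, not the script used in the thread: the Solr URL, the collection name and the field name `my_facet_field` are hypothetical placeholders, and the third limit value (2000) is added for illustration. It relies only on the standard Solr request parameters `facet`, `facet.field`, `facet.limit` and `facet.method`.

```python
import time
import urllib.request
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and collection name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/mycollection/select"


def facet_url(field, limit, method="uif", base=SOLR_SELECT):
    """Build a facet-only query URL for one field and one facet.limit value."""
    params = {
        "q": "*:*",
        "rows": 0,              # no documents needed, only facet counts
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,
        "facet.method": method,
    }
    return base + "?" + urlencode(params)


def time_request(url):
    """Return wall-clock milliseconds for one request (network time included)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    # "my_facet_field" is a placeholder; substitute the field being tested.
    for limit in (20, 200, 2000):
        url = facet_url("my_facet_field", limit)
        print(limit, url)
        # Uncomment when a Solr instance is reachable at SOLR_SELECT:
        # print(f"facet.limit={limit}: {time_request(url):.0f} ms")
```

Wall-clock timing from the client includes network and response parsing overhead, so the `QTime` value in the Solr response header is usually the better number to compare across runs.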

Re: slow solr facet processing

2017-09-01 Thread Günter Hipler
Yonik, thanks for the hint about the uif facet method. (btw: why isn't it part of the official documentation? - at least I haven't found it) For our use case it means: the time for facet processing is exactly the same as it is with version 4. But this works only for indexes 'without' docValues I

Re: slow solr facet processing

2017-09-01 Thread Ere Maijala
I can confirm that we're seeing the same issue as Günter. For a collection of 57 million bibliographic records, Solr 4.10.2 (without docValues) can consistently return a facet in about 20ms, while Solr 6.6.0 with docValues takes around 2600ms. I've tested some versions between those two too,

Re: slow solr facet processing

2017-08-31 Thread Yonik Seeley
A possible improvement for some multiValued fields might be to use the "uif" facet method (UnInvertedField was the default method for multiValued fields in 4.x). I'm not sure whether you would need to reindex without docValues on that field to try it, though. Example: to enable on the "union" field, add