Ere:
This is an excellent summary. It matches what I think I know, and it's
always nice to see confirmation!
I'd add two small enhancements. Your point 5 mentions sorting. The same
consideration is true for grouping and faceting as well. What all three
have in common is that they answer the
Hi Everyone,
This is a follow-up to the discussion from September 2017. Since then
I've spent a lot of time building a better understanding of docValues
compared to UIF, and of other things related to Solr performance. Here's a
summary of the results based on my real-world experience:
1. Making
Toke Eskildsen wrote on 5.9.2017 at 13.49:
On Mon, 2017-09-04 at 11:03 -0400, Yonik Seeley wrote:
It's due to this (see comments in UnInvertedField):
I have read that. What I don't understand is the difference between 4.x
and 6.x. But as you say, Ere seems to be in the process of verifying
Yonik Seeley wrote on 4.9.2017 at 18.03:
It's due to this (see comments in UnInvertedField):
 * To further save memory, the terms (the actual string values) are not all stored in
 * memory, but a TermIndex is used to convert term numbers to term values only
 * for the terms needed after
The number-of-segments noise probably swamps this... but one
optimization around deep-facet-paging that didn't get carried forward
is
https://issues.apache.org/jira/browse/SOLR-2092
-Yonik
On Tue, Sep 5, 2017 at 6:49 AM, Toke Eskildsen wrote:
> On Mon, 2017-09-04 at 11:03 -0400,
On Mon, 2017-09-04 at 11:03 -0400, Yonik Seeley wrote:
> It's due to this (see comments in UnInvertedField):
I have read that. What I don't understand is the difference between 4.x
and 6.x. But as you say, Ere seems to be in the process of verifying
whether this is simply due to more segments in
On Mon, Sep 4, 2017 at 6:38 AM, Toke Eskildsen wrote:
> On Mon, 2017-09-04 at 13:21 +0300, Ere Maijala wrote:
>> Thanks for the insight, Yonik. I can confirm that #2 is true. I ran
>>
>>
>>
>> and after it completed I was able to retrieve 2000 values in 17ms.
>
> Very interesting. Is
Toke Eskildsen wrote on 4.9.2017 at 13.38:
On Mon, 2017-09-04 at 13:21 +0300, Ere Maijala wrote:
Thanks for the insight, Yonik. I can confirm that #2 is true. I ran
and after it completed I was able to retrieve 2000 values in 17ms.
Very interesting. Is this on spinning disks or SSD? Is
On Mon, 2017-09-04 at 13:21 +0300, Ere Maijala wrote:
> Thanks for the insight, Yonik. I can confirm that #2 is true. I ran
>
>
>
> and after it completed I was able to retrieve 2000 values in 17ms.
Very interesting. Is this on spinning disks or SSD? Is your index data
cached in memory? What I
Yonik Seeley wrote on 1.9.2017 at 17.03:
> On Fri, Sep 1, 2017 at 9:17 AM, Ere Maijala wrote:
>> I spoke a bit too soon. Now I see why I didn't see any improvement from
>> facet.method=uif before: its performance seems to depend heavily on how many
>> facets are
On Fri, Sep 1, 2017 at 9:17 AM, Ere Maijala wrote:
> I spoke a bit too soon. Now I see why I didn't see any improvement from
> facet.method=uif before: its performance seems to depend heavily on how many
> facets are returned. With an index of 6 million records and the
I spoke a bit too soon. Now I see why I didn't see any improvement from
facet.method=uif before: its performance seems to depend heavily on how
many facets are returned. With an index of 6 million records and the
facet having 1960 buckets:
facet.limit=20 takes 4ms
facet.limit=200 takes ~100ms
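For anyone who wants to repeat this kind of comparison, here is a minimal sketch of how the request URLs could be built for different facet.limit values. The base URL, the core name "example_core" and the field name "example_facet" are made up for illustration and are not from this thread; only the facet.* request parameters are standard Solr ones.

```python
from urllib.parse import urlencode


def facet_url(base_url, field, limit, method="uif"):
    """Build a facet-only select URL (rows=0) for one field.

    base_url and field are placeholders supplied by the caller;
    nothing is sent to Solr here, the function only builds the URL.
    """
    params = {
        "q": "*:*",
        "rows": 0,                # no documents, only facet counts
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,     # number of buckets to return
        "facet.method": method,   # e.g. "uif", "fc" or "enum"
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical core and field names, for illustration only.
    base = "http://localhost:8983/solr/example_core"
    for limit in (20, 200, 2000):
        print(facet_url(base, "example_facet", limit))
```

Fetching each URL a few times and looking at QTime in the response header should show whether the time grows with facet.limit in a given setup.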
Yonik, thanks for the hint with the uif facet method.
(btw: why isn't it part of the official documentation? - at least I
haven't found it)
For our use case it means:
The time for facet processing is exactly the same as it is with version 4.
But this works only for indexes without docValues
I
I can confirm that we're seeing the same issue as Günter. For a
collection of 57 million bibliographic records, Solr 4.10.2 (without
docValues) can consistently return a facet in about 20ms, while Solr
6.6.0 with docValues takes around 2600ms. I've tested some versions
between those two too,
A possible improvement for some multiValued fields might be to use the
"uif" facet method (UnInvertedField was the default method for
multiValued fields in 4.x)
I'm not sure if you would need to reindex without docValues on that
field to try it though.
Example: to enable on the "union" field, add