Neil Ireson created SOLR-6803:
---------------------------------
Summary: Pivot Performance
Key: SOLR-6803
URL: https://issues.apache.org/jira/browse/SOLR-6803
Project: Solr
Issue Type: Bug
Affects Versions: 4.10.2
Reporter: Neil Ireson
Priority: Minor
I found that my pivot search for terms per day was taking an age, so I knocked
up a quick test, using a collection of 1 million documents with varying
numbers of random terms and times, to compare different ways of getting the
counts.
1) Combined = combining the term and time in a single field.
2) Facet = for each term, set the query to the term and then get the time facet.
3) Pivot = use the term/time pivot facet.
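
As a rough illustration of the three approaches, the sketch below builds the request parameters each one might send to Solr's select handler. The field names (term, time, term_time), the base URL and the helper names are assumptions made for this example; the report does not give the actual schema or queries. Only the standard facet parameters (facet, facet.field, facet.pivot) are used.

```python
from urllib.parse import urlencode

# Hypothetical field names: the report does not name the real schema fields.
TERM_FIELD = "term"
TIME_FIELD = "time"
COMBINED_FIELD = "term_time"  # assumed to hold term and time in one value


def combined_params():
    """1) Combined: one field facet over the combined term/time field."""
    return {"q": "*:*", "rows": 0, "facet": "true",
            "facet.field": COMBINED_FIELD}


def facet_params(term):
    """2) Facet: restrict the query to one term, then facet on time.

    This request is issued once per term.
    """
    return {"q": f'{TERM_FIELD}:"{term}"', "rows": 0, "facet": "true",
            "facet.field": TIME_FIELD}


def pivot_params():
    """3) Pivot: a single request using the term/time pivot facet."""
    return {"q": "*:*", "rows": 0, "facet": "true",
            "facet.pivot": f"{TERM_FIELD},{TIME_FIELD}"}


def select_url(base, params):
    """Build a select URL for a core or collection base URL."""
    return f"{base}/select?{urlencode(params)}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr/test"  # assumed location
    print(select_url(base, combined_params()))
    print(select_url(base, facet_params("foo")))
    print(select_url(base, pivot_params()))
```

Note that approach 2 costs one request per term, while approaches 1 and 3 are single requests. A real comparison would probably also set facet.limit=-1 so that all values are returned rather than the default top 100.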
The following two tables present the results for version 4.9.1 vs 4.10.1, as an
average of five runs.
4.9.1
         |   Processing time in ms    |
Values   | Combined |  Facet |  Pivot |
100      |       22 |     21 |     52 |
1000     |      178 |     57 |    115 |
10000    |     1363 |    211 |    310 |
100000   |     2592 |   1009 |    978 |
500000   |     3125 |   3753 |   2476 |
1000000  |     3957 |   6789 |   3725 |
4.10.1
         |   Processing time in ms    |
Values   | Combined |  Facet |  Pivot |
100      |       21 |     21 |     75 |
1000     |      188 |     60 |    265 |
10000    |     1438 |    215 |   1826 |
100000   |     2768 |   1073 |  16594 |
500000   |     3266 |   3686 |  99682 |
1000000  |     4080 |   6777 | 208873 |
The results show that, as the number of pivot values increases (i.e. number of
terms * number of times), pivot performance in 4.10.1 gets progressively worse,
reaching 208873 ms at 1000000 values against 3725 ms in 4.9.1.
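
To put a number on the regression, the following sketch divides each 4.10.1 pivot time by the corresponding 4.9.1 pivot time. The timings are copied from the two tables; the helper name slowdowns is only for this example. The ratio grows from roughly 1.4x at 100 values to roughly 56x at 1000000 values.

```python
# Pivot timings in ms, copied from the 4.9.1 and 4.10.1 tables.
values = [100, 1000, 10000, 100000, 500000, 1000000]
pivot_491 = [52, 115, 310, 978, 2476, 3725]
pivot_4101 = [75, 265, 1826, 16594, 99682, 208873]


def slowdowns(old, new):
    """Return new/old for each pair of timings."""
    return [n / o for o, n in zip(old, new)]


if __name__ == "__main__":
    for v, ratio in zip(values, slowdowns(pivot_491, pivot_4101)):
        print(f"{v:>8} values: {ratio:5.1f}x slower in 4.10.1")
```

The Combined and Facet columns stay within a few percent between the two versions, so the slowdown appears specific to the pivot path.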
I tried to look at the code, but there were a lot of changes in pivoting between
4.9 and 4.10, so it is not clear to me what has caused the performance
issues. However, the results seem to indicate that if the pivot were simply a
combined facet search, it could potentially produce better and more robust
performance.