[jira] [Commented] (SOLR-6803) Pivot Performance

Neil Ireson (JIRA) Thu, 11 Dec 2014 07:39:52 -0800

    [ https://issues.apache.org/jira/browse/SOLR-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242697#comment-14242697 ]


Neil Ireson commented on SOLR-6803:
-----------------------------------

I'm not really sure what you are asking. The issue is the significant fall in 
performance for "1-level" pivots, which I imagine cover the vast majority of 
use cases, so I'm not sure what is to be gained from looking at higher levels 
of pivots.

In general I would say that simply making repeated calls to the facet code 
would be the expected baseline performance of the pivot code, unless pivoting 
is providing some additional funky functionality. At the moment, even when I 
make the facet calls from within my own code and aggregate the results myself, 
I generally (i.e. up to 100,000 values) get better performance than when I use 
pivots, which, from my naive perspective, does not seem right.
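
To illustrate what I mean by calling the facet code and aggregating the 
results, here is a minimal sketch. It is not the code from the attached test: 
the facetCall function is a hypothetical stand-in for "set the query to one 
term and facet on the time field", and the class and method names are made up 
for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class FacetAsPivotSketch {

    /**
     * Builds a pivot-shaped result (term -> time -> count) by issuing one
     * facet call per term. The facetCall argument is a hypothetical stand-in
     * for a real Solr request that queries on the term and facets on time.
     */
    static Map<String, Map<String, Long>> pivotFromFacets(
            List<String> terms,
            Function<String, Map<String, Long>> facetCall) {
        Map<String, Map<String, Long>> pivot = new LinkedHashMap<>();
        for (String term : terms) {
            // One facet request per term; its counts form the second pivot level.
            pivot.put(term, new LinkedHashMap<>(facetCall.apply(term)));
        }
        return pivot;
    }

    public static void main(String[] args) {
        // In-memory stub instead of Solr; the counts are invented for the demo.
        Map<String, Map<String, Long>> stub = new LinkedHashMap<>();
        Map<String, Long> apple = new LinkedHashMap<>();
        apple.put("2014-12-10", 3L);
        apple.put("2014-12-11", 1L);
        stub.put("apple", apple);
        Map<String, Long> pear = new LinkedHashMap<>();
        pear.put("2014-12-11", 7L);
        stub.put("pear", pear);

        Map<String, Map<String, Long>> pivot =
                pivotFromFacets(Arrays.asList("apple", "pear"), stub::get);
        System.out.println(pivot);
    }
}
```

The point is only that the aggregation itself is trivial, so the cost of this 
approach is essentially the cost of the individual facet calls.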

> Pivot Performance
> -----------------
>
>                 Key: SOLR-6803
>                 URL: https://issues.apache.org/jira/browse/SOLR-6803
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.10.2
>            Reporter: Neil Ireson
>            Priority: Minor
>         Attachments: PivotPerformanceTest.java
>
>
> I found that my pivot search for terms per day was taking an age, so I 
> knocked up a quick test, using a collection of 1 million documents with 
> varying numbers of random terms and times, to compare different ways of 
> getting the counts.
> 1) Combined = combining the term and time in a single field.
> 2) Facet = for each term, set the query to the term and then get the time 
> facet.
> 3) Pivot = use the term/time pivot facet.
> The following two tables present the results for version 4.9.1 vs 4.10.1, as 
> an average of five runs.
> 4.9.1 (Processing time in ms)
> | Values (#) | Combined (ms) | Facet (ms) | Pivot (ms) |
> |        100 |            22 |         21 |         52 |
> |       1000 |           178 |         57 |        115 |
> |      10000 |          1363 |        211 |        310 |
> |     100000 |          2592 |       1009 |        978 |
> |     500000 |          3125 |       3753 |       2476 |
> |    1000000 |          3957 |       6789 |       3725 |
> 4.10.1 (Processing time in ms)
> | Values (#) | Combined (ms) | Facet (ms) | Pivot (ms) |
> |        100 |            21 |         21 |         75 |
> |       1000 |           188 |         60 |        265 |
> |      10000 |          1438 |        215 |       1826 |
> |     100000 |          2768 |       1073 |      16594 |
> |     500000 |          3266 |       3686 |      99682 |
> |    1000000 |          4080 |       6777 |     208873 |
> The results show that, as the number of pivot values increases (i.e. number 
> of terms * number of times), pivot performance in 4.10.1 gets progressively 
> worse.
> I tried to look at the code, but there were a lot of changes in pivoting 
> between 4.9 and 4.10, so it is not clear to me what has caused the 
> performance issues. However, the results seem to indicate that if the pivot 
> were simply a combined facet search, it could potentially produce better and 
> more robust performance.
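
To put a number on "progressively worse", the pivot columns of the two tables 
quoted above can be compared row by row. This is just arithmetic on the 
reported timings, not a new measurement; the class and method names are 
invented for the sketch.

```java
public class PivotSlowdownSketch {

    // Pivot timings in ms, copied from the two tables in the issue description.
    static final int[] VALUES = {100, 1000, 10000, 100000, 500000, 1000000};
    static final long[] PIVOT_4_9_1 = {52, 115, 310, 978, 2476, 3725};
    static final long[] PIVOT_4_10_1 = {75, 265, 1826, 16594, 99682, 208873};

    /** Ratio of the 4.10.1 pivot time to the 4.9.1 pivot time for row i. */
    static double slowdown(int i) {
        return (double) PIVOT_4_10_1[i] / PIVOT_4_9_1[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < VALUES.length; i++) {
            System.out.printf("%8d values: %.1fx slower%n", VALUES[i], slowdown(i));
        }
    }
}
```

The ratio grows with every row, from roughly 1.4x at 100 values to roughly 
56x at 1,000,000 values, while the Combined and Facet columns stay close to 
their 4.9.1 timings.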



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

