[ https://issues.apache.org/jira/browse/SOLR-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908063#comment-14908063 ]
Yonik Seeley commented on SOLR-8096:
------------------------------------
Once again, UnInvertedField was not part of the lucene FieldCache. It was a
Solr class cached in SolrIndexSearcher (via fieldValueCache), did not implement
the DocValues API, etc. The *lucene* FieldCache was made package protected (an
implementation detail) so one would need to access it via DocValues. That's
what the issue was about.
bq. So the committers decided to step forward and remove the top-level
facetting (which was long overdue).
Where was this discussion? I see nothing about it on LUCENE-5666.
And of course I would have given a -1 to such a change for being dogmatic over
practical and not caring about our users.
bq. I was informed about the changes mentioned here
Where did this discussion take place? I can't find it in any public forum.
bq. I was always in favour of removing those top-level facetting algorithms. So
they still have my strong +1.
With no benchmarking of how the replacement performed? No option to use the
old method if a user *wanted* to? Without any public discussion of the impacts?
Without any note in Solr's CHANGES?
So you were strongly for the change, but you knew I'd most likely be against
it, right (based on previous discussions about top-level data structures)?
> Major faceting performance regressions
> --------------------------------------
>
> Key: SOLR-8096
> URL: https://issues.apache.org/jira/browse/SOLR-8096
> Project: Solr
> Issue Type: Bug
> Affects Versions: 5.0, 5.1, 5.2, 5.3, Trunk
> Reporter: Yonik Seeley
> Priority: Critical
>
> Use of the highly optimized faceting that Solr had for multi-valued fields
> over relatively static indexes was *secretly removed* as part of LUCENE-5666,
> causing severe performance regressions.
> Here are some quick benchmarks to gauge the damage, on a 5M document index,
> with each field having between 0 and 5 values per document. *Higher numbers
> represent worse 5x performance*.
> Solr 5.4_dev faceting time as a percent of Solr 4.10.3 faceting time
> ||...................................|| Percent of index being faceted
> ||num_unique_values|| 10% || 50% || 90% ||
> |10 | 351.17% | 1587.08% | 3057.28% |
> |100 | 158.10% | 203.61% | 1421.93% |
> |1000 | 143.78% | 168.01% | 1325.87% |
> |10000 | 137.98% | 175.31% | 1233.97% |
> |100000 | 142.98% | 159.42% | 1252.45% |
> |1000000 | 255.15% | 165.17% | 1236.75% |
> For example, for a field with 1000 unique values in the whole index, faceting
> with 5x took 143% of the 4x time when ~10% of the docs in the index were
> faceted.
> One user who brought the performance problem to our attention:
> http://markmail.org/message/ekmqh4ocbkwxv3we
> "faceting is unusable slow since upgrade to 5.3.0" (from 4.10.3)
> The disabling of the UnInvertedField algorithm was previously discovered in
> SOLR-7190, but we didn't know just how bad the problem was at that time.
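
To make the benchmark table in the issue description easier to read, here is a minimal sketch that converts each "percent of 4.10.3 time" figure into a slowdown factor (percent / 100) and finds the worst cell. The numbers are copied from the table as-is; the names PERCENT_OF_4X_TIME, slowdown_factor and worst_case are illustrative only and do not come from Solr or from the issue.

```python
# Solr 5.4_dev faceting time as a percent of Solr 4.10.3 faceting time,
# copied from the table in the issue description.
# Key: num_unique_values. Value: percentages for 10%, 50%, 90% of the index faceted.
FRACTIONS_FACETED = ("10%", "50%", "90%")
PERCENT_OF_4X_TIME = {
    10: (351.17, 1587.08, 3057.28),
    100: (158.10, 203.61, 1421.93),
    1000: (143.78, 168.01, 1325.87),
    10000: (137.98, 175.31, 1233.97),
    100000: (142.98, 159.42, 1252.45),
    1000000: (255.15, 165.17, 1236.75),
}


def slowdown_factor(percent_of_4x_time):
    """Turn 'percent of the 4x time' into 'times slower than 4x'."""
    return percent_of_4x_time / 100.0


def worst_case(table):
    """Return (num_unique_values, fraction_faceted, slowdown) for the largest cell."""
    cells = (
        (num_unique, fraction, slowdown_factor(percent))
        for num_unique, row in table.items()
        for fraction, percent in zip(FRACTIONS_FACETED, row)
    )
    return max(cells, key=lambda cell: cell[2])


if __name__ == "__main__":
    for num_unique, row in PERCENT_OF_4X_TIME.items():
        factors = ", ".join(
            f"{fraction}: {slowdown_factor(percent):.1f}x"
            for fraction, percent in zip(FRACTIONS_FACETED, row)
        )
        print(f"{num_unique:>8} unique values -> {factors}")
    print("worst case:", worst_case(PERCENT_OF_4X_TIME))
```

Read this way, a cell of 100% would mean no change, and anything above it is a regression relative to 4.10.3.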