[ 
https://issues.apache.org/jira/browse/SOLR-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907568#comment-14907568
 ] 

Yonik Seeley edited comment on SOLR-8096 at 9/26/15 12:28 PM:
--------------------------------------------------------------

bq. Are you sure it was secret and not just a mistake?

Yes.

- This algorithm had been relied upon by many since 2008 (SOLR-475), and 
completely removing its use and replacing it would obviously warrant 
discussion, benchmarks, etc.
- This was a massive patch, and relevant changes should be called out, esp if 
changes seem unrelated to the issue's description.
- If you search the JIRA issue, "UnInvertedField" *never* appears.
  (the linked issues mention it now, but those were added by us after the fact)
- The issue's title is "Add UninvertingReader" and the description had to do 
with Lucene's FieldCache, which UnInvertedField is not part of.
- There is *no* mention of the issue or changes anywhere in Solr's CHANGES.txt.
- When asked to comment on impacts of this massive patch, the answer given was 
"Is the CHANGES.txt entry not good here? The docvalues apis did not change..."
- The CHANGES entry for lucene made no mention of the change to Solr or 
UnInvertedField.
- Although the UnInvertedField code was left behind (as dead code), the removal 
of the use of UnInvertedField was *not* by mistake - you can see by the test 
code that was explicitly removed (TestFaceting.java).

edit: removed inflammatory conclusion 


was (Author: [email protected]):
bq. Are you sure it was secret and not just a mistake?

Yes.

- This algorithm had been relied upon by many since 2008 (SOLR-475), and 
completely removing its use and replacing it would obviously warrant 
discussion, benchmarks, etc.
- This was a massive patch, and relevant changes should be called out, esp if 
changes seem unrelated to the issue's description.
- If you search the JIRA issue, "UnInvertedField" *never* appears.
  (the linked issues mention it now, but those were added by us after the fact)
- The issue's title is "Add UninvertingReader" and the description had to do 
with Lucene's FieldCache, which UnInvertedField is not part of.
- There is *no* mention of the issue or changes anywhere in Solr's CHANGES.txt.
- When asked to comment on impacts of this massive patch, the answer given was 
"Is the CHANGES.txt entry not good here? The docvalues apis did not change..."
- The CHANGES entry for lucene made no mention of the change to Solr or 
UnInvertedField.
- Although the UnInvertedField code was left behind (as dead code), the removal 
of the use of UnInvertedField was *not* by mistake - you can see by the test 
code that was explicitly removed (TestFaceting.java).

Exactly what other conclusion is there to draw?  Massive incompetence?

> Major faceting performance regressions
> --------------------------------------
>
>                 Key: SOLR-8096
>                 URL: https://issues.apache.org/jira/browse/SOLR-8096
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.0, 5.1, 5.2, 5.3, Trunk
>            Reporter: Yonik Seeley
>            Priority: Critical
>
> Use of the highly optimized faceting that Solr had for multi-valued fields 
> over relatively static indexes was removed as part of LUCENE-5666, causing 
> severe performance regressions.
> Here are some quick benchmarks to gauge the damage, on a 5M document index, 
> with each field having between 0 and 5 values per document.  *Higher numbers 
> represent worse 5x performance*.
> Solr 5.4_dev faceting time as a percent of Solr 4.10.3 faceting time 
> (columns: percent of index being faceted)
> ||num_unique_values|| 10% || 50% || 90% ||
> |10 | 351.17% | 1587.08% | 3057.28% |
> |100 | 158.10% | 203.61% | 1421.93% |
> |1000 | 143.78% | 168.01% | 1325.87% |
> |10000 | 137.98% | 175.31% | 1233.97% |
> |100000 | 142.98% | 159.42% | 1252.45% |
> |1000000 | 255.15% | 165.17% | 1236.75% |
> For example, a field with 1000 unique values in the whole index, faceting 
> with 5x took 143% of the 4x time, when ~10% of the docs in the index were 
> faceted.
> One user who brought the performance problem to our attention: 
> http://markmail.org/message/ekmqh4ocbkwxv3we
> "faceting is unusable slow since upgrade to 5.3.0" (from 4.10.3)
> The disabling of the UnInvertedField algorithm was previously discovered in 
> SOLR-7190, but we didn't know just how bad the problem was at that time.
> edit: removed "secret" adverb by request



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

