[jira] [Commented] (SOLR-3141) Deprecate OPTIMIZE command in Solr

Uwe Schindler (Commented) (JIRA) Sun, 19 Feb 2012 08:11:02 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13211394#comment-13211394
 ]


Uwe Schindler commented on SOLR-3141:
-------------------------------------

I'll just repeat here what Mike already posted on the Lucene issue:

{quote}
Some quick googling uncovers depressing examples of over-optimizing:

* https://jira.duraspace.org/browse/FCREPO-155
* http://stackoverflow.com/questions/3912253/is-it-mandatory-to-optimize-the-lucene-index-after-write
* http://issues.liferay.com/browse/LPS-2944
* http://download.oracle.com/docs/cd/E19316-01/820-7054/girqf/index.html
* https://issues.sonatype.org/browse/MNGECLIPSE-2359
* http://blog.inflinx.com/tag/lucene

That last one has this fun comment:

{code:java}
// Lucene recommends calling optimize upon completion of indexing 
writer.optimize();
{code}
{quote}

Most of the above items also affect Solr. The first one, for example (I know 
people from FIZ Karlsruhe and Fedora), is really funny: Fedora GSearch calls 
optimize=true on every add of a single document to Solr. I even know people 
who use Solr and complained about GSearch because of this.

We can fix those horrible user-code bugs very quickly by making optimize a 
no-op in Solr; they will all appreciate that. To repeat: nobody's installation 
would break, it would just get faster.

A funny detail: with Lucene 3.x, search actually gets faster with multiple 
segments if you do parallel ExecutorService-based search (I still don't really 
recommend using an ExecutorService on IndexSearcher...). On the other hand, 
searching a non-optimized pre-2.9 index, which had no per-segment search, 
really was slower, as MultiTermsEnum and MultiDocsEnum were used.

With Lucene 3.x there is really no slowdown at all caused by multiple segments, 
as each segment is searched on its own with no interaction and the results are 
just added to the same priority queue. I agree, Solr has some problems with 
faceting, but people should use per-segment faceting and not optimize; this 
would improve their installations immensely (the actual faceting might get 
slower, but on the other hand FieldCaches can be reused, so overall it 
actually gets faster). The current default is global faceting and (for most 
installations) "optimize on *every* single item added" (see above links).
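
To illustrate the per-segment point, here is a minimal sketch in plain Java. 
It is not Lucene code: the class PerSegmentSearchSketch, the Hit type and the 
topN method are made up for illustration, and each float array merely stands 
in for one segment's scores. It only shows the idea that segments are visited 
one after another and all hits land in one shared priority queue, so the 
number of segments does not change how results are merged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names exist in Lucene or Solr.
class PerSegmentSearchSketch {

    /** A scored hit. docId is global: the segment's docBase plus the local id. */
    static final class Hit {
        final int docId;
        final float score;

        Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /**
     * Collects the n best hits across all segments.
     * segments[s][i] is the score of local document i in segment s;
     * a score of 0 (or less) means "no match".
     */
    static List<Hit> topN(float[][] segments, int n) {
        // Min-heap on score: the head is always the weakest hit kept so far.
        PriorityQueue<Hit> queue =
            new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

        int docBase = 0;
        for (float[] segment : segments) {
            // Each segment is scanned on its own; the queue is the only shared state.
            for (int local = 0; local < segment.length; local++) {
                float score = segment[local];
                if (score <= 0f) {
                    continue;
                }
                queue.offer(new Hit(docBase + local, score));
                if (queue.size() > n) {
                    queue.poll(); // drop the weakest hit
                }
            }
            docBase += segment.length;
        }

        // Return best hits first.
        List<Hit> hits = new ArrayList<>(queue);
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        return hits;
    }
}
```

Calling topN with three small "segments" or with one big merged array of the 
same scores gives the same top hits; only the loop over segments differs.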
                
> Deprecate OPTIMIZE command in Solr
> ----------------------------------
>
>                 Key: SOLR-3141
>                 URL: https://issues.apache.org/jira/browse/SOLR-3141
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>    Affects Versions: 3.5
>            Reporter: Jan Høydahl
>              Labels: force, optimize
>             Fix For: 3.6
>
>
> Background: LUCENE-3454 renames optimize() as forceMerge(). Please read that 
> issue first.
> Now that optimize() is rarely necessary anymore, and renamed in Lucene APIs, 
> what should be done with Solr's ancient optimize command?
