+1 - this is good info. The first thing it says is "You may want to optimize an index whenever practical -- ie: if you build your index once, and then never modify it."
True stuff. At worst we should clarify "practical" a bit more - but I still don't think it's bad as is. Then it gives you further good info.

What is the point of removing it?

- Mark

On Nov 4, 2011, at 1:57 PM, Chris Hostetter wrote:

> Completely removing all of this info seems like more harm than good -- it
> actually advises against doing an optimize except when you know you're
> never going to modify your index, and it explains the downsides of
> optimizing.
>
> I would suggest we add most of this back, but perhaps change the title
> (since many pieces of info in this section aren't specific to
> optimizing, they're just about segments) and be more vigorous in warning
> about the costs of optimize.
>
> : The "SolrPerformanceFactors" page has been changed by RobertMuir:
> : http://wiki.apache.org/solr/SolrPerformanceFactors?action=diff&rev1=28&rev2=29
> :
> : Comment:
> : die optimize die
> :
> :
> :   * Memory usage during indexing
> :   * Segment merge time
> : - * Optimization times
> :   * Index size
> :
> :   These impacts can be reduced by the use of `omitNorms="true"`
> : @@ -74, +73 @@
> :
> :   === Explicit Warming of Sort Fields ===
> :
> :   If you do a lot of field based sorting, it is advantageous to add explicitly warming queries to the "newSearcher" and "firstSearcher" event listeners in your solrconfig which sort on those fields, so the !FieldCache is populated prior to any queries being executed by your users.
> : -
> : - == Optimization Considerations ==
> : -
> : - You may want to optimize an index whenever practical -- ie: if you build your index once, and then never modify it.
> : -
> : - If your index is receiving a steady stream of modifications, then consider the following factors...
> : -
> : -  * As more segments are added to the index, query performance will degrade slightly. Automatic segment merging by Lucene will set an upper bound on the number of segments created though.
> : -  * Auto-warming time will grow since it's normally dependent on doing searches.
> : -  * The first distribution after an optimization will take longer than subsequent ones. See [[CollectionDistribution|Collection Distribution]] for more information.
> : -  * During optimization the file size of the index doubles, but returns to its original size or even slightly less.
> : -  * If you can, make sure that you do not have multiple concurrent producers of documents calling commit(). Multiple concurrent commits will cause a large performance degradation.
> : -
> : - Since optimizing an index saves all the segments in an index (about 7 files per segment) into a single segment, optimizing an index helps avoid the "too many open files" problem, i.e. running out of file descriptors, which is mentioned in an [[http://www.onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed|ONJava Article]].
> :
> :   == Updates and Commit Frequency Tradeoffs ==
> :
> :
>
> -Hoss
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]

- Mark Miller
lucidimagination.com
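
For a rough sense of scale on two points in the removed wiki text (the "about 7 files per segment" figure and the index size doubling during an optimize), here is a back-of-the-envelope sketch in Python. The function names and sample numbers are hypothetical, and the constants are taken straight from the wiki wording rather than measured; actual file counts and disk usage depend on the Lucene version and index settings.

```python
# Rough estimates based on figures quoted in the removed wiki text.
# FILES_PER_SEGMENT is the wiki's "about 7 files per segment"; treat it
# as an assumption, not a measured value.
FILES_PER_SEGMENT = 7


def estimated_index_files(segment_count: int) -> int:
    """Approximate number of index files for a given segment count."""
    return segment_count * FILES_PER_SEGMENT


def peak_disk_during_optimize(index_size_bytes: int) -> int:
    """The wiki says the index roughly doubles in size while an optimize runs."""
    return 2 * index_size_bytes


if __name__ == "__main__":
    for segments in (1, 10, 30):
        print(segments, "segments ->", estimated_index_files(segments), "files")
    ten_gb = 10 * 1024**3
    print("peak bytes while optimizing a 10 GB index:",
          peak_disk_during_optimize(ten_gb))
```

Under these assumptions, going from many segments to one cuts the file count by roughly the segment count, at the price of temporarily needing about twice the disk space.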
