I've tweaked what was there. Some of the info was a bit ... fugly ... so I kept the main point about optimize being useful for a static index and added a couple of other bits. It's open for someone else to add back anything they think is useful but was cut.

On Nov 5, 2011, at 8:42 AM, Mark Miller wrote:

> +1 - this is good info. The first thing it says is "You may want to optimize
> an index whenever practical -- ie: if you build your index once, and then
> never modify it."
>
> True stuff. At worst we should clarify practical a bit more - but I still
> don't think it's bad as is.
>
> Then it gives you further good info.
>
> What is the point of removing it?
>
> - Mark
>
> On Nov 4, 2011, at 1:57 PM, Chris Hostetter wrote:
>
>> Completely removing all of this info seems like more harm than good -- it
>> actually advises against doing an optimize except when you know you're
>> never going to modify your index, and it explains the downsides of
>> optimizing.
>>
>> i would suggest we add most of this back, but perhaps change the title
>> (since many pieces of info in this section aren't specific to
>> optimizing, they're just about segments) and be more vigorous in warning
>> about the costs of optimize.
>>
>> : The "SolrPerformanceFactors" page has been changed by RobertMuir:
>> : http://wiki.apache.org/solr/SolrPerformanceFactors?action=diff&rev1=28&rev2=29
>> :
>> : Comment:
>> : die optimize die
>> :
>> :
>> : * Memory usage during indexing
>> : * Segment merge time
>> : - * Optimization times
>> : * Index size
>> :
>> : These impacts can be reduced by the use of `omitNorms="true"`
>> : @@ -74, +73 @@
>> :
>> : === Explicit Warming of Sort Fields ===
>> :
>> : If you do a lot of field based sorting, it is advantageous to add explicitly warming queries to the "newSearcher" and "firstSearcher" event listeners in your solrconfig which sort on those fields, so the !FieldCache is populated prior to any queries being executed by your users.
>> : -
>> : - == Optimization Considerations ==
>> : -
>> : - You may want to optimize an index whenever practical -- ie: if you build your index once, and then never modify it.
>> : -
>> : - If your index is receiving a steady stream of modifications, then consider the following factors...
>> : -
>> : - * As more segments are added to the index, query performance will degrade slightly. Automatic segment merging by Lucene will set an upper bound on the number of segments created though.
>> : - * Auto-warming time will grow since it's normally dependent on doing searches.
>> : - * The first distribution after an optimization will take longer than subsequent ones. See [[CollectionDistribution|Collection Distribution]] for more information.
>> : - * During optimization the file size of the index doubles, but returns to its original size or even slightly less.
>> : - * If you can, make sure that you do not have multiple concurrent producers of documents calling commit(). Multiple concurrent commits will cause a large performance degradation.
>> : -
>> : - Since optimizing an index saves all the segments in an index (about 7 files per segment) into a single segment, optimizing an index helps avoid the "too many open files" problem, i.e. running out of file descriptors, which is mentioned in an [[http://www.onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed|ONJava Article]].
>> :
>> : == Updates and Commit Frequency Tradeoffs ==
>> :
>> :
>>
>> -Hoss
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>
> - Mark Miller
> lucidimagination.com

- Mark Miller
lucidimagination.com
