Thanks, Markus, that is useful. I'm guessing the higher the weight, the longer the op takes?
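
For anyone following along, here is a minimal sketch of the expungeDeletes commit discussed further down the thread. It assumes the standard XML update handler at /update. The base URL, the core name "collection1", and the helper names are placeholders rather than anything from our setup, so adjust them to your install.

```python
import urllib.request


def build_expunge_commit(base_url, core):
    """Build the URL, headers and body for a commit that expunges deletes.

    base_url and core are hypothetical placeholders, e.g.
    "http://localhost:8983/solr" and "collection1".
    """
    url = f"{base_url.rstrip('/')}/{core}/update"
    headers = {"Content-Type": "text/xml"}
    # Solr's XML update message for a commit that also merges away
    # segments containing deleted documents.
    body = '<commit expungeDeletes="true"/>'
    return url, headers, body


def send_commit(url, headers, body):
    """POST the commit to Solr and return the HTTP status code."""
    request = urllib.request.Request(
        url, data=body.encode("utf-8"), headers=headers, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    url, headers, body = build_expunge_commit(
        "http://localhost:8983/solr", "collection1"
    )
    print(url)
    print(body)
    # Uncomment against a running Solr instance:
    # print(send_commit(url, headers, body))
```

Note that this only builds and sends the request. How long the resulting merge takes depends on how many segments contain deletes.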

On Tue, Apr 1, 2014 at 10:39 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:

> You may want to increase reclaimDeletesWeight for TieredMergePolicy from 2
> to 3 or 4. By default it may keep too many deleted or updated docs in the
> index. This can increase index size by 50%!
>
> Dmitry Kan <solrexp...@gmail.com> wrote:
>
> Elisabeth,
>
> Yes, I believe you are right in that the deletes are part of the optimize
> process. If you delete often, you may consider (if not already) the
> TieredMergePolicy, which is suited for this scenario. Check out this
> relevant discussion I had with Lucene committers:
> https://twitter.com/DmitryKan/status/399820408444051456
>
> HTH,
>
> Dmitry
>
> On Tue, Apr 1, 2014 at 11:34 AM, elisabeth benoit
> <elisaelisael...@gmail.com> wrote:
>
> > Thanks a lot for your answers!
> >
> > Shawn, our GC configuration has far fewer parameters defined, so we'll
> > check this out.
> >
> > Dmitry, about the expungeDeletes option, we'll add that in the delete
> > process. But from what I read, this is done in the optimize process (cf.
> > http://lucene.472066.n3.nabble.com/Does-expungeDeletes-need-calling-during-an-optimize-td1214083.html).
> > Or maybe not?
> >
> > Thanks again,
> > Elisabeth
> >
> > 2014-04-01 7:52 GMT+02:00 Dmitry Kan <solrexp...@gmail.com>:
> >
> > > Hi,
> > >
> > > We have noticed something like this as well, but with an older version
> > > of Solr, 3.4. In our setup we delete documents pretty often. Internally
> > > in Lucene, when a client requests that a document be deleted, it is not
> > > physically deleted, but only marked as "deleted". Our original
> > > optimization assumption was that the "deleted" documents would get
> > > physically removed on each optimize command issued. We started to
> > > suspect this wasn't always true as the shards (especially relatively
> > > large shards) became slower over time. So we found out about the
> > > expungeDeletes option, which purges the "deleted" docs and is false by
> > > default. We have set it to true. If your Solr update lifecycle includes
> > > frequent deletes, try this out.
> > >
> > > This of course does not replace working towards finding better GC
> > > parameters.
> > >
> > > https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching
> > >
> > > On Mon, Mar 31, 2014 at 3:57 PM, elisabeth benoit
> > > <elisaelisael...@gmail.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > We are currently using Solr 4.2.1. Our index is updated on a daily
> > > > basis. After noticing that Solr query time had increased (to twice
> > > > the initial value) without any change in index size or in Solr
> > > > configuration, we tried an optimize on the index, but it didn't fix
> > > > our problem. We checked the garbage collector, but everything seemed
> > > > fine. What did in fact fix our problem was to delete all documents
> > > > and reindex from scratch.
> > > >
> > > > It looks like over time our index gets "corrupted" and optimize
> > > > doesn't fix it. Does anyone have a clue how to investigate this
> > > > situation further?
> > > >
> > > > Elisabeth
> > >
> > > --
> > > Dmitry
> > > Blog: http://dmitrykan.blogspot.com
> > > Twitter: http://twitter.com/dmitrykan

--
Dmitry
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan