Thanks, Markus, that is useful.
I'm guessing the higher the weight, the longer the op takes?
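
For anyone following along, here is a rough sketch (not taken from anyone's actual setup) of the expungeDeletes commit discussed in the quoted messages, plus a quick way to estimate how much of an index is held by deleted docs. The host, port and core name in the usage note are placeholders I made up, and both helper function names are hypothetical; the only Solr-specific parts are the commit and expungeDeletes request parameters on the update handler and the maxDoc/numDocs counters shown in the admin stats.

```python
from urllib.parse import urlencode


def expunge_commit_url(base_url: str, core: str) -> str:
    """Build an update URL that asks Solr to commit and expunge deleted docs.

    base_url and core are placeholders (assumptions), not values from the thread.
    """
    params = urlencode({"commit": "true", "expungeDeletes": "true"})
    return f"{base_url.rstrip('/')}/{core}/update?{params}"


def deleted_fraction(max_doc: int, num_docs: int) -> float:
    """Fraction of index slots still occupied by deleted docs.

    maxDoc counts live plus deleted docs; numDocs counts live docs only.
    """
    if max_doc <= 0:
        return 0.0
    return (max_doc - num_docs) / max_doc
```

Calling expunge_commit_url("http://localhost:8983/solr", "collection1") only builds the URL; it still has to be requested against a running Solr. deleted_fraction(150, 100) gives one third, i.e. an index carrying 50% more docs than are live.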


On Tue, Apr 1, 2014 at 10:39 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:

> You may want to increase reclaimDeletesWeight for TieredMergePolicy from 2
> to 3 or 4. By default it may keep too many deleted or updated docs in the
> index. This can increase index size by 50%!!
>
> Dmitry Kan <solrexp...@gmail.com> wrote:
>
> Elisabeth,
>
> Yes, I believe you are right in that the deletes are part of the optimize
> process. If you delete often, you may consider (if not already) the
> TieredMergePolicy, which is suited for this scenario. Check out this
> relevant discussion I had with Lucene committers:
> https://twitter.com/DmitryKan/status/399820408444051456
>
> HTH,
>
> Dmitry
>
>
> On Tue, Apr 1, 2014 at 11:34 AM, elisabeth benoit <elisaelisael...@gmail.com> wrote:
>
> > Thanks a lot for your answers!
> >
> > Shawn, our GC configuration has far fewer parameters defined, so we'll
> > check this out.
> >
> > Dmitry, about the expungeDeletes option, we'll add that in the delete
> > process. But from what I read, this is done in the optimize process (cf.
> > http://lucene.472066.n3.nabble.com/Does-expungeDeletes-need-calling-during-an-optimize-td1214083.html
> > ).
> > Or maybe not?
> >
> > Thanks again,
> > Elisabeth
> >
> >
> > 2014-04-01 7:52 GMT+02:00 Dmitry Kan <solrexp...@gmail.com>:
> >
> > > Hi,
> > >
> > > We have noticed something like this as well, but with an older version of
> > > Solr, 3.4. In our setup we delete documents pretty often. Internally in
> > > Lucene, when a client requests that a document be deleted, it is not
> > > physically deleted, but only marked as "deleted". Our original
> > > assumption about optimization was that the "deleted" documents would get
> > > physically removed on each optimize command issued. We started to suspect
> > > this wasn't always true as the shards (especially relatively large
> > > shards) became slower over time. So we found out about the expungeDeletes
> > > option, which purges the "deleted" docs and is false by default. We have
> > > set it to true. If your Solr update lifecycle includes frequent deletes,
> > > try this out.
> > >
> > > This of course does not replace working towards finding better
> > > GC parameters.
> > >
> > > https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching
> > >
> > >
> > > On Mon, Mar 31, 2014 at 3:57 PM, elisabeth benoit <elisaelisael...@gmail.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > We are currently using Solr 4.2.1. Our index is updated on a daily
> > > > basis. After noticing that Solr query time had increased (to twice its
> > > > initial value) without any change in index size or in Solr
> > > > configuration, we tried an optimize on the index, but it didn't fix our
> > > > problem. We checked the garbage collector, but everything seemed fine.
> > > > What did in fact fix our problem was to delete all documents and
> > > > reindex from scratch.
> > > >
> > > > It looks like over time our index gets "corrupted" and optimize doesn't
> > > > fix it. Does anyone have a clue how to investigate this situation
> > > > further?
> > > >
> > > >
> > > > Elisabeth
> > > >
> > >
> > >
> > >
> > > --
> > > Dmitry
> > > Blog: http://dmitrykan.blogspot.com
> > > Twitter: http://twitter.com/dmitrykan
> > >
> >
>
>
>
> --
> Dmitry
> Blog: http://dmitrykan.blogspot.com
> Twitter: http://twitter.com/dmitrykan
>



-- 
Dmitry
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
