Thanks Shawn and Erick.
So far I haven't noticed any performance issues before and after the change.
My concern all along is COST. We could have left the configuration as is -
keeping the deleted documents in the index - but we would have to scale up our
Solr cluster. This will double our Solr
"Some large segments were merged into 12GB segments and
deleted documents were physically removed."
and
"So with the current natural merge strategy, I need to update solrconfig.xml
and increase the maxMergedSegmentMB often"
I strongly recommend you do not continue down this path. You’re making a
On 10/25/2020 11:22 PM, Moulay Hicham wrote:
I am wondering about 3 other things:
1 - You mentioned that I need free disk space. Just to make sure: we
are talking about disk space here, and RAM can remain at the same size?
My current RAM size is: index size < RAM < 1.5 x index size
You
Thanks so much for clarifying. I have deployed the change to prod and it seems
to be working. Some large segments were merged into 12GB segments and
deleted documents were physically removed.
I am wondering about 3 other things:
1 - You mentioned that I need free disk space. Just to make sure that
Well, you mentioned that the segments you’re concerned about were merged a year ago.
If segments aren’t being merged, they’re pretty static.
There’s no real harm in optimizing _occasionally_, even in an NRT index. If you
have segments that were merged that long ago, you may be indexing continually but
Thanks Erick.
My index is near real time and frequently updated.
I checked this page
https://lucene.apache.org/solr/guide/8_1/uploading-data-with-index-handlers.html#xml-update-commands
and it says that using forceMerge/expungeDeletes is NOT recommended.
So I was hoping that the change in mergePolicyFactory
Just go ahead and optimize/forceMerge, but do _not_ optimize to one
segment. Or you can expungeDeletes, which will rewrite all segments with
more than 10% deleted docs. As of Solr 7.5, these operations respect the 5G
limit.
See: https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/
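For anyone following along, here is a rough Python sketch of how the two requests suggested above could be put together. It only builds the URLs for the update handler and does not send anything. The base URL and the collection name "mycollection" are placeholders, and the helper names are made up for illustration; only the commit, expungeDeletes, optimize and maxSegments request parameters are Solr's own.

```python
from urllib.parse import urlencode


def update_url(base_url, collection, **params):
    """Build an update-handler URL; booleans are sent as 'true'/'false'."""
    query = urlencode(
        {k: str(v).lower() if isinstance(v, bool) else v for k, v in params.items()}
    )
    return f"{base_url.rstrip('/')}/{collection}/update?{query}"


def expunge_deletes_url(base_url, collection):
    # Commit with expungeDeletes: asks Solr to rewrite segments that carry
    # enough deleted documents, without merging the whole index.
    return update_url(base_url, collection, commit=True, expungeDeletes=True)


def force_merge_url(base_url, collection, max_segments):
    # Guard against the one-segment optimize that the advice above warns about.
    if max_segments < 2:
        raise ValueError("do not optimize down to a single segment")
    return update_url(base_url, collection, optimize=True, maxSegments=max_segments)


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # placeholder host and port
    print(expunge_deletes_url(base, "mycollection"))
    print(force_merge_url(base, "mycollection", 10))
```

The maxSegments value of 10 is arbitrary; pick whatever fits your index, and try it on a non-production copy first since both operations rewrite segments and need free disk space while they run.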
Hi,
I am using Solr 8.1 in production. We have about 30%-50% deleted
documents in some old segments that were merged a year ago.
Each of these segments is about 5GB in size.
I was wondering why these segments have a high % of deleted docs and found
out that they are NOT candidates for merging
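
As an aside, the deleted-docs percentage mentioned above is just deleted documents divided by all documents in the segment (live plus deleted). A small Python sketch of that arithmetic follows; the dictionary keys (name, max_doc, del_count) and the sample numbers are invented for illustration and are not taken from this index or from any Solr response format.

```python
def deleted_pct(max_doc, del_count):
    """Percentage of a segment's documents that are marked deleted."""
    if max_doc <= 0:
        return 0.0
    return 100.0 * del_count / max_doc


def segments_over(segments, threshold_pct):
    """Names of segments whose deleted percentage exceeds the threshold."""
    return [
        s["name"]
        for s in segments
        if deleted_pct(s["max_doc"], s["del_count"]) > threshold_pct
    ]


if __name__ == "__main__":
    # Illustrative numbers only.
    sample = [
        {"name": "_a", "max_doc": 1000, "del_count": 400},
        {"name": "_b", "max_doc": 1000, "del_count": 50},
    ]
    print(segments_over(sample, 30.0))
```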