Re: TieredMergePolicyFactory question

2020-10-26 Thread Moulay Hicham
Thanks, Shawn and Erick. So far I haven't noticed any performance issues, either before or after the change. My concern all along is COST. We could have left the configuration as is, keeping the deleted documents in the index, but then we would have to scale up our Solr cluster. This will double our Solr

Re: TieredMergePolicyFactory question

2020-10-26 Thread Erick Erickson
"Some large segments were merged into 12GB segments and deleted documents were physically removed." and "So with the current natural merge strategy, I need to update solrconfig.xml and increase the maxMergedSegmentMB often" I strongly recommend you do not continue down this path. You’re making a

Re: TieredMergePolicyFactory question

2020-10-26 Thread Shawn Heisey
On 10/25/2020 11:22 PM, Moulay Hicham wrote: I am wondering about 3 other things: 1 - You mentioned that I need free disk space. Just to make sure that we are talking about disk space here. RAM can still remain at the same size? My current RAM size is: index size < RAM < 1.5 × index size. You

Re: TieredMergePolicyFactory question

2020-10-25 Thread Moulay Hicham
Thanks so much for clarifying. I have deployed the change to prod and it seems to be working. Some large segments were merged into 12GB segments and deleted documents were physically removed. I am wondering about 3 other things: 1 - You mentioned that I need free disk space. Just to make sure that

Re: TieredMergePolicyFactory question

2020-10-23 Thread Erick Erickson
Well, you mentioned that the segments you’re concerned about were merged a year ago. If segments aren’t being merged, they’re pretty static. There’s no real harm in optimizing _occasionally_, even in an NRT index. If you have segments that were merged that long ago, you may be indexing continually but

Re: TieredMergePolicyFactory question

2020-10-23 Thread Moulay Hicham
Thanks, Erick. My index is near real time and frequently updated. I checked this page https://lucene.apache.org/solr/guide/8_1/uploading-data-with-index-handlers.html#xml-update-commands and it says that using forceMerge/expungeDeletes is NOT recommended. So I was hoping that the change in mergePolicyFactory

Re: TieredMergePolicyFactory question

2020-10-23 Thread Erick Erickson
Just go ahead and optimize/forceMerge, but do _not_ optimize to one segment. Or you can expungeDeletes, which will rewrite all segments with more than 10% deleted docs. As of Solr 7.5, these operations respect the 5G limit. See: https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/
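
For illustration only, the two operations mentioned above can be triggered through request parameters on the update handler (commit with expungeDeletes, or optimize with maxSegments). The sketch below only builds the URLs and sends nothing. The host, the collection name "mycollection", the helper name update_url, and the maxSegments value of 10 are assumptions made for this example and do not come from the thread.

```python
from urllib.parse import urlencode


def update_url(base_url, collection, **params):
    """Build a Solr update-handler URL carrying the given request parameters.

    update_url is a hypothetical helper for this sketch, not part of Solr.
    """
    # Solr expects lowercase "true"/"false" for boolean request parameters.
    query = urlencode(
        {k: str(v).lower() if isinstance(v, bool) else v for k, v in params.items()}
    )
    return f"{base_url}/{collection}/update?{query}"


BASE = "http://localhost:8983/solr"  # assumed local Solr address
COLLECTION = "mycollection"          # hypothetical collection name

# Commit that also rewrites segments holding many deleted documents.
expunge = update_url(BASE, COLLECTION, commit=True, expungeDeletes=True)

# Force merge down to several segments rather than one
# (10 is an arbitrary illustrative value, not a recommendation).
force_merge = update_url(BASE, COLLECTION, optimize=True, maxSegments=10)

print(expunge)
print(force_merge)
```

Either URL could then be requested with any HTTP client. Both operations rewrite segments and need free disk space while they run, so an off-peak window is the safer time to try them.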

TieredMergePolicyFactory question

2020-10-23 Thread Moulay Hicham
Hi, I am using Solr 8.1 in production. We have about 30%-50% deleted documents in some old segments that were merged a year ago. These segments are about 5GB in size. I was wondering why these segments have a high % of deleted docs and found out that they are NOT candidates for merging