At a certain point the index size doesn't matter. When you reindex a document,
Solr does not remove the existing copy from the index; it marks it as deleted
and appends the replacement. An optimize is what removes the documents marked
as deleted, but an optimize is really no longer a recommended process, since
Solr is very good at merging segments on its own and disk is inexpensive. My
guess at why the index grew is that even though the field is only indexed and
not stored, its data is still written into the index and, of course,
duplicated. If performance has not been adversely affected, I would not ever
run the optimize command. I've pushed an index that is naturally 450 GB all
the way to 800 GB+ and it ran great, assuming you have the disk space
available.
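
To make the mark-as-deleted point concrete, here is a toy sketch in Python. It
is not Solr or Lucene code: ToyIndex, its methods, and the 100-byte document
size are all made up for illustration, and real segment merging is far more
involved. It only shows why reindexing the same document grows the on-disk
size until a merge rewrites the data.

```python
# Toy model only: not Solr/Lucene code, just the bookkeeping idea.
class ToyIndex:
    def __init__(self):
        self.entries = []   # append-only list of [doc_id, size_bytes, live]
        self.live = {}      # doc_id -> position of its live entry

    def add(self, doc_id, size_bytes):
        # Reindexing an existing id only flags the old copy as deleted;
        # the old bytes stay where they are.
        if doc_id in self.live:
            self.entries[self.live[doc_id]][2] = False
        self.entries.append([doc_id, size_bytes, True])
        self.live[doc_id] = len(self.entries) - 1

    def size_on_disk(self):
        # Counts live and deleted entries alike.
        return sum(size for _, size, _ in self.entries)

    def live_size(self):
        return sum(size for _, size, live in self.entries if live)

    def merge(self):
        # A merge (or optimize) rewrites the data without deleted entries.
        self.entries = [e for e in self.entries if e[2]]
        self.live = {e[0]: i for i, e in enumerate(self.entries)}


index = ToyIndex()
for _ in range(3):
    index.add("doc-1", 100)   # same document indexed three times
print(index.size_on_disk(), index.live_size())  # 300 100
index.merge()
print(index.size_on_disk(), index.live_size())  # 100 100
```

In this model the gap between size_on_disk() and live_size() is the space held
by deleted copies, which is the part a merge eventually gives back.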

> On May 18, 2021, at 12:37 PM, Kudrettin Güleryüz <[email protected]> wrote:
> 
> Hello,
> 
> Experimenting with optimizing the index size.
> 
> Can you help me understand why indexing but not storing a file 10,000
> increases the index size by 2,500 times? 7.3 here. Schema and all other
> conditions are kept constant.
> 
> Thanks
