On 6/22/2021 11:29 PM, Stephen Lewis Bianamara wrote:
This was a ton of great info I needed, and more than I initially knew to
ask :) Your first response seems to imply an answer to my original
question, but I wanted to follow up to be as sure as I can. In the recovery
scenario, are there situations where a complete index will copy over next
to the original index, thus requiring 2x the disk space? Or is that now
outdated? I could imagine for example the replacement is now done on each
segment at a smaller scale or something along those lines and so recovery
requirements would expect to be on par with merge requirements, or perhaps
there is some "bad enough" scenario where a full side-by-side copy is made
during recovery. Can you comment on that?


SolrCloud recovery uses the replication handler, configuring it on the fly.  It works almost exactly like rsync, and I am pretty sure it mimics rsync's -W (--whole-file) option: if a file with the same name already exists but has a different size, the whole file is copied rather than just the differences.

So if the index it's copying is substantially similar to the one you've already got -- at the file level -- the recovery will be fast and not take much space.  But if it's very different (again, at the file level, not at the Lucene level) it very well might copy the entire index over, and only then delete the files that make up the existing index.  In that case you would temporarily need room for both copies, so plan for 2x the disk space.
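
To make the file-level comparison concrete, here is a rough sketch in Python.  This is not Solr's code.  The function name plan_recovery and the name-to-size dictionaries are made up for illustration, and it assumes files are matched on name and size only, as described above; the real replication handler may check more than that.

```python
def plan_recovery(local_files, leader_files):
    """Estimate what a file-level, whole-file sync would copy.

    Both arguments are hypothetical dicts mapping filename -> size in bytes.
    Assumption: a file is reused only if the same name exists locally with
    the same size; anything else is copied in full.
    """
    # Files to fetch: missing locally, or present with a different size.
    to_copy = {
        name: size
        for name, size in leader_files.items()
        if local_files.get(name) != size
    }
    # Local files that are no longer part of the leader's index.
    # These are removed only after the copy completes.
    stale = sorted(
        name
        for name, size in local_files.items()
        if leader_files.get(name) != size
    )
    # Extra disk needed while old and new files coexist.
    extra_bytes = sum(to_copy.values())
    return to_copy, stale, extra_bytes


if __name__ == "__main__":
    local = {"_0.cfs": 100, "_1.cfs": 50, "segments_2": 1}
    # Similar index: only one new segment plus a new segments file.
    similar = {"_0.cfs": 100, "_2.cfs": 70, "segments_3": 1}
    # Very different index, e.g. after a full merge on the leader.
    merged = {"_5.cfs": 150, "segments_6": 1}
    for leader in (similar, merged):
        to_copy, stale, extra = plan_recovery(local, leader)
        print(sorted(to_copy), stale, extra)
```

With the "similar" listing only the new files are fetched.  With the "merged" listing nothing matches by name and size, so the extra space needed is the size of the whole index.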

Thanks,
Shawn
