On 5/24/23 10:48, Walter Underwood wrote:
> I think I know how we got into this mess. The cluster is configured and
> deployed into Kubernetes. I think it was rebuilt with more shards, then the
> existing storage volumes were mounted for the matching shards. New shards got
> empty volumes. Then the content was reloaded without a delete-all.

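You're probably aware, but that approach to re-sharding just plain will not work. Increasing or decreasing the shard count of a compositeId-routed collection requires re-indexing from scratch, because each shard owns a fixed range of the document hash space and already-indexed documents do not move when those ranges change. The only way to add shards to an existing collection is to use SPLITSHARD, unless the collection uses the implicit router, in which case CREATESHARD is available.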
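
For reference, a SPLITSHARD request is just a Collections API call. Here is a minimal Python sketch that only builds the URL and sends nothing. The base URL, the collection name "mycollection", and the shard name "shard1" are placeholders I made up for illustration; substitute whatever your cluster actually uses.

```python
from urllib.parse import urlencode


def build_splitshard_url(solr_base, collection, shard):
    """Return the Collections API URL that asks Solr to split one shard.

    solr_base, collection and shard are caller-supplied; the values used
    in the example below are hypothetical placeholders.
    """
    params = {
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
    }
    # Strip any trailing slash so the path is not doubled up.
    return f"{solr_base.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Only prints the URL; issue the request with your HTTP client of choice.
    print(build_splitshard_url("http://localhost:8983/solr", "mycollection", "shard1"))
```

Each split divides one shard's hash range into two sub-shards, so going from N shards to something much larger means repeating the call once per shard you want split.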

I've seen discussion of a rebalance API, but no implementation. It would not be easy to implement. I have thought of one approach that might make it doable ... but it might not be possible to send any updates to the collection until the entire rebalance is complete. Assuming it's even possible, the approach I thought of would require a LOT of extra disk space, a lot of extra bandwidth usage, and would take much longer to run than an optimize. It might even take longer than doing a full re-index from the source system.

Thanks,
Shawn
