Hi Or,

This is a pretty scary report!  If you haven't already, can you open a
JIRA ticket describing how you detect and reproduce the problem?  I'm
reasonably familiar with the restore code, so I can help review if we
do ultimately get a PR to fix this.

Best,

Jason


On Tue, Dec 24, 2024 at 12:22 PM Or Nagar <ornaga...@gmail.com> wrote:
>
> Hello,
>
> I’ve been working extensively with Apache Solr, specifically with
> large-scale Solr Cloud collections, and I've encountered an issue that I’d
> like to share with you.
>
> In Solr Cloud mode, when a collection has two or more replicas in a shard,
> I noticed a significant problem when restoring from a backup to the shard's
> leader. After the restore, the follower replica fails to start replication.
> This is a critical issue, as it means that the follower will not have the
> updated data, rendering it essentially useless.
>
> In the worst case, if the shard leader becomes unavailable for any reason,
> the outdated follower could be promoted to leader. This would result in a
> situation where the previously outdated follower replica becomes the leader
> and the former leader may even start replicating from what used to be its own
> replica - which will cause a full loss of the restored data.
>
> I tested this behavior in a multi-sharded Solr collection with two replicas
> per shard, and the issue was consistently reproducible. The restore is a
> shard restore.
>
> I am more than willing to contribute to the necessary development to
> resolve this if needed.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
For additional commands, e-mail: dev-h...@solr.apache.org
