Thanks for your answers.
> The message by Pierre is regarding fixing existing code.
Definitely. Here I want to fix some gaps in the current mechanism for
leader election, which in my opinion is much less work than a full
rework with a different approach.
I will file a Jira ticket for this
Well, we're always operating on consensus; sometimes it's just lazy
consensus. If the sentiment in the community is unclear, we (should)
clarify with a vote before committing... Ideally it wouldn't get to the
point of a veto. At least that's my understanding.
If Pierre comes up with a patch to fix
My reply might be a little surprising; maybe I hit "send" too quickly. Of
course one should work to invest in getting more consensus; maybe the idea
isn't fully understood; maybe the concerns aren't fully understood. But
consensus isn't so much a state that is achieved or not; it's shades of
You may be surprised at what can be accomplished without "consensus" :-).
Vetoes are the blocker. If you/anyone are convinced enough and put forth a
proposal of what you are going to do, get feedback, and say you are going
to do it (in spite of concerns but obviously try to address them!), go for
The message by Pierre is regarding fixing existing code.
The leader on demand doesn't seem to be a short-term solution in any case,
and there wasn't really a consensus around the proposal.
Ilan
On Tue, Dec 19, 2023 at 4:16 PM David Smiley wrote:
I would be more in favor of going back to the drawing board on leader
election than incremental improvements. Go back to first principles. The
clarity just isn't there to be maintained. I don't trust it.
Coincidentally I sent a message to the Apache Curator users list yesterday
to inquire
I think it's a worthy problem to address given we (we work at the same
company) ran into a production incident due to it.
Who's familiar and interested enough in leader election code to help review
such changes?
Thanks,
Ilan
On Mon, Dec 18, 2023 at 5:33 PM Pierre Salagnac wrote:
We recently had a couple of issues with production clusters because of race
conditions in shard leader election. By race condition here, I mean within a
single node. I'm not discussing how leader election is distributed
across multiple Solr nodes, but how multiple threads in a single Solr node