[ 
https://issues.apache.org/jira/browse/SLING-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16585632#comment-16585632
 ] 

Stefan Egli commented on SLING-7830:
------------------------------------

bq. Are those changes to the leaderElectionIds in the repository picked up by 
all instances?
That's a delicate point. They are indeed picked up by all instances, but the 
switch from one 'cluster view' to the next isn't atomic in this case, and 
that's a problem I guess. 
Normally any topology change is detected and reported by Oak's discovery-lite 
via the discovery-lite-descriptor. In Sling that then leads to a 
TOPOLOGY_CHANGING first, then to an increment in the sync counters, and only 
after that, once all instances have updated their sync counters, are 
TOPOLOGY_CHANGED events sent. So this guarantees that upon receiving a 
TOPOLOGY_CHANGED no other instance in the cluster can still be the leader.
Now if we'd just change the leaderElectionId with today's implementation, the 
switch would happen independently on each instance. That could lead to a 
subtle race condition, i.e. duplicate leaders for a short time, and that might 
be a problem for some topology listeners.
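
The ordering described above (TOPOLOGY_CHANGING first, then sync counter updates, then TOPOLOGY_CHANGED only once every instance has caught up) can be sketched roughly as below. This is a simplified in-memory illustration and not Sling's or Oak's actual implementation: the class ClusterSyncSketch and all of its methods are hypothetical names, and the per-instance event log is only there to make the ordering visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not part of Sling or Oak.
final class ClusterSyncSketch {
    private final Map<String, Integer> syncCounters = new HashMap<>();
    private final Map<String, List<String>> eventLog = new HashMap<>();
    private int targetCounter = 0;
    private boolean changing = false;

    ClusterSyncSketch(List<String> instanceIds) {
        for (String id : instanceIds) {
            syncCounters.put(id, 0);
            eventLog.put(id, new ArrayList<>());
        }
    }

    // Step 1: a topology change is detected, every instance is told
    // that the topology is changing and that a new counter value is expected.
    void topologyChangeDetected() {
        targetCounter++;
        changing = true;
        for (List<String> log : eventLog.values()) {
            log.add("TOPOLOGY_CHANGING");
        }
    }

    // Step 2: one instance updates its sync counter.
    // Step 3: only when all instances have done so is TOPOLOGY_CHANGED sent.
    void acknowledge(String instanceId) {
        syncCounters.put(instanceId, targetCounter);
        if (changing && allSynced()) {
            changing = false;
            for (List<String> log : eventLog.values()) {
                log.add("TOPOLOGY_CHANGED");
            }
        }
    }

    boolean allSynced() {
        for (int counter : syncCounters.values()) {
            if (counter != targetCounter) {
                return false;
            }
        }
        return true;
    }

    List<String> eventsOf(String instanceId) {
        return eventLog.get(instanceId);
    }
}
```

In this sketch no instance sees TOPOLOGY_CHANGED while another instance is still on the old counter value, which is the property that a purely local switch of the leaderElectionId would not have.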
bq. From the part before the underscore the highest value is picked (new), 
while from the part after the underscore the lowest is picked (old).
Interesting idea. On the one hand it would perhaps be simpler, as it would 
just mean incrementing the new installation's leaderElectionId instead of 
incrementing all the others. On the other hand it would make it more difficult 
to understand how leaderElectionIds are sorted (whereas today it's just a 
String.compareTo).
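
To make the difference between the two orderings concrete, here is a minimal sketch. It assumes a leaderElectionId of the form `<numeric prefix>_<rest>` and assumes that the id sorting first is the leader; the class LeaderElectionOrdering and its methods are hypothetical and only illustrate the comparison rule quoted above, not the existing Sling code.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: the id format "<numeric prefix>_<rest>" is an assumption.
final class LeaderElectionOrdering {

    // Numeric part before the first underscore.
    static long prefix(String id) {
        return Long.parseLong(id.substring(0, id.indexOf('_')));
    }

    // Everything after the first underscore.
    static String rest(String id) {
        return id.substring(id.indexOf('_') + 1);
    }

    // Today's rule as described above: plain String.compareTo,
    // assuming the lowest id is the leader.
    static String leaderToday(List<String> ids) {
        return Collections.min(ids);
    }

    // Proposed rule: highest prefix first (newest installation),
    // then lowest remainder (oldest instance within that installation).
    static int compareProposed(String a, String b) {
        int byPrefix = Long.compare(prefix(b), prefix(a));
        return byPrefix != 0 ? byPrefix : rest(a).compareTo(rest(b));
    }

    static String leaderProposed(List<String> ids) {
        return Collections.min(ids, LeaderElectionOrdering::compareProposed);
    }
}
```

For the made-up ids `0_100_a`, `1_300_c` and `1_400_d`, leaderToday picks `0_100_a` while leaderProposed picks `1_300_c`. The proposed rule needs a two-part comparison, which is the extra complexity mentioned above.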
There is one issue here though that might pop up: in my previously suggested 
variant there would be no unwanted/unnecessary topology change at runtime. This 
would be guaranteed as the leaderElectionIds' prefixes would be incremented by 
1 for all instances simultaneously. And that leads to no topology change.
Now in this variant, instead, a new installation increments the prefix. The 
question is when exactly it does that. I assume it can only do that 
once the instance is up - but that would mean it creates a topology change 
(with a leader change) at runtime - and that is not so nice. If it were 
possible to set the leaderElectionId before the instance has even started, 
then this would work. But I'm unsure how that could be achieved.
bq. We could control from the outside whether a leader change is wanted and 
wouldn't need to change anything in the repository, just improve the current 
implementation a little bit.
But the leaderElectionId would have to be changed in the repository, right?

> Defined leader switch
> ---------------------
>
>                 Key: SLING-7830
>                 URL: https://issues.apache.org/jira/browse/SLING-7830
>             Project: Sling
>          Issue Type: Improvement
>          Components: Discovery
>            Reporter: Carsten Ziegeler
>            Priority: Major
>
> The current leader selection is based on startup time and sling id (mainly) 
> and is stable across changes in the topology for as long as the leader is up 
> and running.
> However, there are use cases like blue green deployment where new instances 
> with a new version are started and take over the functionality. With the 
> current discovery setup, the leader would still be one of the 
> instances with the old version.
> With a new deployed version, tasks currently bound to the leader should run 
> on the new version.
> Therefore the leader needs to switch and stay the leader (until it dies).
> We probably need an additional criterion for the leader selection.
> /cc [~egli]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
