[ 
https://issues.apache.org/jira/browse/SLING-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16582255#comment-16582255
 ] 

Stefan Egli edited comment on SLING-7830 at 8/16/18 9:31 AM:
-------------------------------------------------------------

The leader election is based on the leaderElectionId stored in the repository 
under {{/var/discovery/oak/clusterInstances}}. When an instance starts up, it 
stores its own leaderElectionId there. As Carsten mentioned, that id is made up 
of a prefix, then the start time and the slingId (to avoid clashes). At the time 
the cluster view is analysed, the leader is the instance with the *lowest* 
leaderElectionId (String comparison).
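
As a minimal sketch of that rule (not the actual discovery.oak code; the class and method names are made up for illustration), picking the leader amounts to taking the minimum of the stored ids under plain String comparison:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: names are assumptions, not discovery.oak API.
final class LeaderElectionSketch {

    /** Returns the lowest id by plain String comparison, or null if there are none. */
    static String electLeader(List<String> leaderElectionIds) {
        if (leaderElectionIds.isEmpty()) {
            return null;
        }
        // String implements Comparable, so min() uses String.compareTo.
        return Collections.min(leaderElectionIds);
    }

    public static void main(String[] args) {
        // Second id is invented for the example: same prefix, later start time.
        List<String> ids = Arrays.asList(
                "1_0000001534409999999_aaaaaaaa-0000-0000-0000-000000000000",
                "1_0000001534409616936_374019fc-68bd-4c8d-a4cf-8ee8b07c63bc");
        System.out.println("leader: " + electLeader(ids));
    }
}
```

With equal prefixes the zero-padded start time decides, so the instance that started first wins; a lower prefix wins regardless of start time.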

That fact can be used, for example, in the following procedure:
 * before bringing up the new (e.g. blue) instances, move the old (e.g. green) 
instances' leaderElectionIds to the back of the leader comparison order, for 
example by incrementing the prefix, i.e. change the leaderElectionId from 
*{{1}}*{{_0000001534409616936_374019fc-68bd-4c8d-a4cf-8ee8b07c63bc}} to 
*{{2}}*{{_0000001534409616936_374019fc-68bd-4c8d-a4cf-8ee8b07c63bc}} (and do 
the same for *all* old instances). Do this in *a single JCR transaction* 
(otherwise there will be an *unwanted* leader change in the old cluster). If 
some leaderElectionIds already have a prefix of e.g. 2, increment that to 3, etc.
 * then bring up the new (e.g. blue) instances. (One of) the new instance(s) 
will automatically become leader, since the prefix is {{1}} by default and thus 
lower than that of the old instances.
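
The prefix increment in the first step could look roughly like the sketch below. It only covers the string manipulation; reading the ids from the repository and writing them all back in a single save is left out. The class and method names are hypothetical, and the guard against a prefix growing by a digit is an added assumption (with plain String comparison, {{10_...}} would sort before {{2_...}}).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: names are assumptions, not discovery.oak API.
final class LeaderPrefixBumpSketch {

    /** Increments the numeric prefix in front of the first '_' of a leaderElectionId. */
    static String bumpPrefix(String leaderElectionId) {
        int sep = leaderElectionId.indexOf('_');
        if (sep <= 0) {
            throw new IllegalArgumentException("no prefix in: " + leaderElectionId);
        }
        int prefix = Integer.parseInt(leaderElectionId.substring(0, sep));
        String bumped = Integer.toString(prefix + 1);
        if (bumped.length() != sep) {
            // e.g. 9 -> 10: the longer prefix would no longer sort last as a String.
            throw new IllegalStateException("prefix would change length: " + leaderElectionId);
        }
        return bumped + leaderElectionId.substring(sep);
    }

    /** Bumps every old instance's id; the caller would persist the result in one save. */
    static Map<String, String> bumpAll(Map<String, String> idsBySlingId) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : idsBySlingId.entrySet()) {
            result.put(e.getKey(), bumpPrefix(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> old = new LinkedHashMap<>();
        old.put("374019fc-68bd-4c8d-a4cf-8ee8b07c63bc",
                "1_0000001534409616936_374019fc-68bd-4c8d-a4cf-8ee8b07c63bc");
        System.out.println(bumpAll(old));
    }
}
```

Because every old id is bumped by the same amount, the relative order among the old instances is unchanged, which is why the old leader stays leader until the new instances join.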

We could look at automating something like this and providing it via some 
API/JMX.

*PS*: this could be done entirely outside of discovery.oak


> Defined leader switch
> ---------------------
>
>                 Key: SLING-7830
>                 URL: https://issues.apache.org/jira/browse/SLING-7830
>             Project: Sling
>          Issue Type: Improvement
>          Components: Discovery
>            Reporter: Carsten Ziegeler
>            Priority: Major
>
> The current leader selection is based on startup time and sling id (mainly) 
> and is stable across changes in the topology for as long as the leader is up 
> and running.
> However there are use cases like blue green deployment where new instances 
> with a new version are started and taking over the functionality. However 
> with the current discovery setup, the leader would still be one of the 
> instances with the old version.
> With a new deployed version, tasks currently bound to the leader should run 
> on the new version.
> Therefore the leader needs to switch and stay the leader (until it dies).
> We probably need an additional criterion for the leader selection.
> /cc [~egli]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
