[jira] [Commented] (SLING-5030) replace isolated mode with (larger) TOPOLOGY_CHANGING phase

Stefan Egli (JIRA) Thu, 24 Sep 2015 06:57:59 -0700

    [ 
https://issues.apache.org/jira/browse/SLING-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14906370#comment-14906370
 ]


Stefan Egli commented on SLING-5030:
------------------------------------

Also improved a log message: http://svn.apache.org/viewvc?rev=1705059&view=rev

> replace isolated mode with (larger) TOPOLOGY_CHANGING phase
> -----------------------------------------------------------
>
>                 Key: SLING-5030
>                 URL: https://issues.apache.org/jira/browse/SLING-5030
>             Project: Sling
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: Discovery Impl 1.0.2
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>             Fix For: Discovery Impl 1.1.8
>
>
> As [described in 
> SLING-3432|https://issues.apache.org/jira/browse/SLING-3432?focusedCommentId=14492494&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14492494]
>  one major reason why duplicate leaders happen in discovery.impl is the 
> isolated mode. The rule of the discovery API is that every instance is always 
> in a cluster, which makes sense in general. However, when the connection to 
> the cluster (i.e. to the repository) is faulty or delayed for some reason, and 
> the remaining cluster no longer considers the local instance alive (i.e. its 
> heartbeats have timed out), the local instance currently notices this 
> 'isolated' state and wraps itself into a pseudo cluster consisting only of 
> itself, of which it is by definition the leader.
> This is completely wrong: there should be no isolated mode. When the local 
> instance is 'cut off' from the cluster, it should immediately send out a 
> TOPOLOGY_CHANGING event and remain in that state until things have settled 
> with the repository and it has successfully taken part in a voting. Only then 
> can it send out a TOPOLOGY_CHANGED event.
> This should fix a large number of situations where SLING-3432 has been seen.
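
The proposed behaviour can be sketched as a small state machine. This is only an illustration of the description above, not the discovery.impl code: the class TopologyStateSketch, its methods and its local Event enum are hypothetical names. The sketch assumes the instance starts out in an established view.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of the proposed behaviour: no isolated mode,
 * only a TOPOLOGY_CHANGING phase that lasts until a vote succeeds.
 * None of these names are taken from discovery.impl.
 */
class TopologyStateSketch {

    /** Local stand-in for the two event types named in the issue. */
    public enum Event { TOPOLOGY_CHANGING, TOPOLOGY_CHANGED }

    private final List<Event> sent = new ArrayList<>();

    // Assumption: the instance starts in an established (stable) view.
    private boolean changing = false;

    /** The rest of the cluster no longer sees this instance (heartbeats timed out). */
    public void onCutOffFromCluster() {
        if (!changing) {
            changing = true;
            // Do NOT build a one-instance pseudo cluster; just announce the change.
            sent.add(Event.TOPOLOGY_CHANGING);
        }
        // Already changing: stay in that phase and send nothing.
    }

    /** The repository has settled and this instance took part in a successful voting. */
    public void onVotingSucceeded() {
        if (changing) {
            changing = false;
            sent.add(Event.TOPOLOGY_CHANGED);
        }
        // Not in the changing phase: this simplified sketch sends nothing.
    }

    public boolean isChanging() {
        return changing;
    }

    public List<Event> sentEvents() {
        return sent;
    }

    public static void main(String[] args) {
        TopologyStateSketch s = new TopologyStateSketch();
        s.onCutOffFromCluster();  // sends TOPOLOGY_CHANGING
        s.onCutOffFromCluster();  // still cut off: nothing sent
        s.onVotingSucceeded();    // sends TOPOLOGY_CHANGED
        System.out.println(s.sentEvents()); // prints [TOPOLOGY_CHANGING, TOPOLOGY_CHANGED]
    }
}
```

While cut off, the instance never claims leadership of a cluster consisting only of itself. It just stays in the changing phase until a vote succeeds.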



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
