Stefan Egli created SLING-5027:
----------------------------------

             Summary: vote loop until vote is promoted
                 Key: SLING-5027
                 URL: https://issues.apache.org/jira/browse/SLING-5027
             Project: Sling
          Issue Type: Bug
          Components: Extensions
    Affects Versions: Discovery Impl 1.1.0
            Reporter: Stefan Egli
            Assignee: Stefan Egli
             Fix For: Discovery Impl 1.1.8


{{VotingHandler.analyzeVotings}} can run into a busy vote loop for as long as 
a voting exists under {{/ongoingVotings}}:
* {{analyzeVotings}} is invoked either when something changes in 
{{/var/discovery/impl/ongoingVotings}} or as part of a heartbeat
* as part of this, it collects the ongoing votings, decides which one it has 
to vote on (should there be more than one; it does not vote on a voting it 
initiated itself), then potentially calls {{vote(,true)}} on it
* {{vote()}} not only sets the {{vote}} property accordingly (true or 
false); since 
[SLING-3434|https://issues.apache.org/jira/browse/SLING-3434?focusedCommentId=14297078&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14297078]
 it also sets the {{votedAt}} property (a timestamp)
* since the above is a change in {{/ongoingVotings}}, it triggers an 
observation event, which in turn causes {{analyzeVotings}} to be called again. 
That call again finds an ongoing voting, votes on it, triggers another 
observation event, and so on. The result is a busy loop that involves the 
repository and the observation handler
* the above loop only occurs since the introduction of the {{votedAt}} 
property, as that value changes on each iteration. Without it, voting again 
with the same boolean would not result in an observation event and the loop 
would not happen at all.
* in any case, the loop lasts at most until the initiator has received all 
the votes and can promote the voting to an {{/establishedView}}. This should 
typically be a fast operation, but if the cluster is under heavy load and 
experiences delays for some reason, this busy loop can last a while.
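
The mechanism described above can be illustrated with a small toy model. This 
is only a sketch under stated assumptions, not the Discovery Impl code: the 
class {{VoteLoopSketch}}, its methods and the in-memory map standing in for the 
voting node are all hypothetical. It assumes, as described above, that an 
observation event fires only when a written property value actually differs 
from the stored one, and it caps the number of {{analyzeVotings}} calls to 
model the point at which the initiator promotes the voting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Toy model of the vote loop. None of these names are Sling APIs. */
class VoteLoopSketch {
    /** Hypothetical stand-in for a voting node under /ongoingVotings. */
    private final Map<String, Object> votingNode = new HashMap<>();
    private final boolean writeVotedAt;
    private final int maxCalls;   // models "until the voting is promoted"
    private long clock = 0;       // fake timestamp source, advances per vote

    VoteLoopSketch(boolean writeVotedAt, int maxCalls) {
        this.writeVotedAt = writeVotedAt;
        this.maxCalls = maxCalls;
    }

    /** Writes a property; returns true only if the stored value changed. */
    private boolean setProperty(String name, Object value) {
        Object old = votingNode.put(name, value);
        return !Objects.equals(old, value);
    }

    /** Returns true if the write would cause an observation event. */
    private boolean vote(boolean value) {
        boolean changed = setProperty("vote", value);
        if (writeVotedAt) {
            // The timestamp differs on every call, so this always changes.
            changed |= setProperty("votedAt", ++clock);
        }
        return changed;
    }

    /** Counts how often the analyzeVotings step runs before things settle. */
    int run() {
        int analyzeCalls = 0;
        boolean pendingEvent = true; // initial trigger, e.g. a heartbeat
        while (pendingEvent && analyzeCalls < maxCalls) {
            analyzeCalls++;
            pendingEvent = vote(true);
        }
        return analyzeCalls;
    }

    public static void main(String[] args) {
        System.out.println("vote only:      " + new VoteLoopSketch(false, 50).run());
        System.out.println("vote + votedAt: " + new VoteLoopSketch(true, 50).run());
    }
}
```

In this model the variant without {{votedAt}} settles after the second call, 
because re-writing the same boolean changes nothing, while the variant with 
{{votedAt}} keeps going until the cap is reached.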

So this loop is a regression introduced with SLING-3434.
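
One conceivable way to break such a loop is to skip the write entirely when the 
same vote is already recorded, so that {{votedAt}} is only touched when the 
vote itself changes. The following is a hypothetical sketch of that idea only; 
it is not taken from the Discovery Impl sources and is not necessarily the 
change made for this issue. {{GuardedVoteSketch}} and {{voteIfNeeded}} are 
made-up names, and a plain map again stands in for the voting node.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical guard against re-voting. Not the actual Sling fix. */
class GuardedVoteSketch {
    /** Stand-in for the voting node's properties. */
    private final Map<String, Object> votingNode = new HashMap<>();
    private long clock = 0; // fake timestamp source

    /**
     * Records the vote unless the same value is already stored.
     * Returns true if something was written, i.e. if an observation
     * event would be expected in this toy model.
     */
    boolean voteIfNeeded(boolean value) {
        if (Boolean.valueOf(value).equals(votingNode.get("vote"))) {
            return false; // same vote already present: write nothing
        }
        votingNode.put("vote", value);
        votingNode.put("votedAt", ++clock);
        return true;
    }

    /** Returns the recorded timestamp, or 0 if no vote was recorded yet. */
    long votedAt() {
        Object v = votingNode.get("votedAt");
        return v == null ? 0L : (Long) v;
    }

    public static void main(String[] args) {
        GuardedVoteSketch g = new GuardedVoteSketch();
        System.out.println(g.voteIfNeeded(true)); // first vote is written
        System.out.println(g.voteIfNeeded(true)); // repeat is skipped
    }
}
```

With such a guard a repeated {{analyzeVotings}} call writes nothing, so in this 
model no further observation event is generated.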



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
