[ 
https://issues.apache.org/jira/browse/SLING-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-3434:
-------------------------------
    Fix Version/s:     (was: Discovery Impl 1.0.14)

(Removed fix version, as this does not seem critical at the moment)

An intermediate step could be to start with a clock-diff-detection 
mechanism (e.g. by means of a 'clock vote' where each instance writes down its 
own UTC time) and simply issue a log.error if the clocks differ substantially. 
That would be non-intrusive and could be rolled out with few side effects or 
risks.
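
To make the 'clock vote' idea concrete, here is a minimal sketch in Java. It is 
not existing discovery.impl code: the class name ClockVote, the 5 second 
threshold and the use of System.err in place of log.error are all assumptions 
made for illustration. It also assumes the votes were written at roughly the 
same moment, since any delay between writes shows up as apparent clock 
difference.

```java
import java.util.Map;

// Hypothetical sketch, not part of discovery.impl.
final class ClockVote {

    // Assumed threshold; what counts as "substantially" is left open in the issue.
    static final long MAX_SKEW_MILLIS = 5_000L;

    /** Largest difference between any two reported UTC timestamps (millis). */
    static long maxSkewMillis(Map<String, Long> votes) {
        if (votes.isEmpty()) {
            return 0L;
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long utcMillis : votes.values()) {
            min = Math.min(min, utcMillis);
            max = Math.max(max, utcMillis);
        }
        return max - min;
    }

    /** True if the reported clocks differ by more than the assumed threshold. */
    static boolean clocksDiffer(Map<String, Long> votes) {
        return maxSkewMillis(votes) > MAX_SKEW_MILLIS;
    }

    /** Stand-in for log.error: only reports, never changes cluster behaviour. */
    static void check(Map<String, Long> votes) {
        if (clocksDiffer(votes)) {
            System.err.println("Cluster clocks differ by " + maxSkewMillis(votes)
                    + " ms (threshold " + MAX_SKEW_MILLIS + " ms): " + votes);
        }
    }
}
```

Each instance would put its own id and UTC time into the map, and any instance 
could then call ClockVote.check(votes). Because the check only logs, it stays 
as non-intrusive as described above.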

As a next step, once we have gathered some experience with the stability and 
feasibility of the above, that mechanism could be used to establish a 'cluster 
time zone', with each instance adding its discovered delta to its own clock. 
That would remove the need to warn if the clocks are not in sync.
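
One possible reading of the 'cluster time zone' step, again as a hedged sketch 
rather than a design decision: each instance derives a delta between its own 
vote and a reference value, then applies that delta whenever it writes or 
compares a heartbeat timestamp. The class name ClusterClock and the choice of 
the median vote as reference are assumptions; the issue does not say how the 
reference would be picked.

```java
import java.util.Map;

// Hypothetical sketch, not part of discovery.impl.
final class ClusterClock {

    private final long deltaMillis;

    ClusterClock(long deltaMillis) {
        this.deltaMillis = deltaMillis;
    }

    /**
     * Derives the local delta from a set of clock votes (instance id -> UTC millis).
     * Assumption: the (upper) median vote serves as the cluster reference.
     */
    static ClusterClock fromVotes(String localId, Map<String, Long> votes) {
        Long own = votes.get(localId);
        if (own == null) {
            throw new IllegalArgumentException("no vote from " + localId);
        }
        long[] sorted = votes.values().stream()
                .mapToLong(Long::longValue)
                .sorted()
                .toArray();
        long reference = sorted[sorted.length / 2];
        return new ClusterClock(reference - own);
    }

    /** Delta this instance adds to its own clock to obtain cluster time. */
    long deltaMillis() {
        return deltaMillis;
    }

    /** Converts a local timestamp into cluster time. */
    long toClusterTime(long localMillis) {
        return localMillis + deltaMillis;
    }
}
```

An instance would then write heartbeats as 
clock.toClusterTime(System.currentTimeMillis()) instead of its raw local time, 
so that timestamps from different machines become comparable even without NTP.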

> Make intra-cluster discovery-heartbeats independent from machine clock 
> differences
> ----------------------------------------------------------------------------------
>
>                 Key: SLING-3434
>                 URL: https://issues.apache.org/jira/browse/SLING-3434
>             Project: Sling
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: Discovery Impl 1.0.2
>            Reporter: Stefan Egli
>
> SLING-2967 fixed an issue where topology connectors were dependent on having 
> machine clocks in sync - so inter-cluster we're no longer dependent on 
> NTP-synching.
> Inside a cluster though, this problem is still there. Since heartbeats are 
> written as absolute time - based on the originator's machine clock - it still 
> only works fine if the whole cluster is NTP-synched.
> In general I think this is not a problem as it is best-practice to make sure 
> machines have NTP set up.
> Nevertheless, it would help if discovery.impl could become independent from 
> this.
> Also, if clocks are off by too much, pseudo-network-partitions can occur, 
> with the result of having multiple leaders in a cluster (also see SLING-3432)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
