[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974312#comment-14974312 ] Stefan Egli commented on SLING-4627: btw: ConsistencyService was renamed to ClusterSyncService.
> TOPOLOGY_CHANGED in an eventually consistent repository
> ---
>
> Key: SLING-4627
> URL: https://issues.apache.org/jira/browse/SLING-4627
> Project: Sling
> Issue Type: Improvement
> Components: Extensions
> Reporter: Stefan Egli
> Assignee: Stefan Egli
> Priority: Critical
> Attachments: SLING-4627.patch, SLING-4627.patch
>
> This is a parent ticket describing the +coordination effort needed for properly sending TOPOLOGY_CHANGED when running on top of an eventually consistent repository+. These findings are independent of the implementation details used inside the discovery implementation, so they apply to discovery.impl, discovery.etcd/.zookeeper/.oak etc. Tickets to implement this for specific implementations are best created separately (e.g. as sub-tasks or related issues). Also note that this assumes immediately sending TOPOLOGY_CHANGING as described [in SLING-3432|https://issues.apache.org/jira/browse/SLING-3432?focusedCommentId=14492494&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14492494]
> h5. The spectrum of possible TOPOLOGY_CHANGED events includes the following scenarios:
> || scenario || classification || action ||
> | A. the change is completely outside of the local cluster | (/) uncritical | changes outside the cluster are considered uncritical for this exercise. |
> | B. a new instance joins the local cluster; this new instance is by contract not the leader (the leader must be stable \[0\]) | (/) uncritical | a join of an instance is uncritical because it merely joins the cluster and thus has no 'backlog' of changes that might be propagating through the (eventually consistent) repository. |
> | C. a non-leader *leaves* the local cluster | (x) *critical* | changes written by the leaving instance might not yet be *seen* by all survivors (i.e. discovery can be faster than the repository), and this must be assured before sending out TOPOLOGY_CHANGED. This is because the leaving instance could have written changes that are *topology dependent*, and those changes must first settle in the repository before continuing with a *new topology*. |
> | D. the leader *leaves* the local cluster (and thus a new leader is elected) | (x)(x) *very critical* | same as C, except more critical because the leader left |
> | E. -the leader of the local cluster changes (without leaving)- this is not supported by contract (the leader must be stable \[0\]) | (/) -irrelevant- | |
> So both C and D are about an instance leaving. And as mentioned above, the survivors must assure they have read all changes of the leavers. There are two parts to this:
> * the leaver could have pending writes that are not yet in MongoDB: I don't think this is the case. The only thing that could remain is an uncommitted branch, and that would be rolled back afaik.
> ** The exception to this is a partition, where the leaver didn't actually crash but is still hooked to the repository. *I'm not sure yet how this can be solved.*
> * the survivors could, however, not yet have read all changes (pending in the background read), and one way to make sure they did is to have each surviving instance write a (pseudo-) sync token to the repository. Once all survivors have seen this sync token of all other survivors, the assumption is that all pending changes have been "flushed" through the eventually consistent repository and that it is safe to send out a TOPOLOGY_CHANGED event.
> * this sync token must be *conflict free* and could be e.g. {{/var/discovery/oak/clusterInstances/slingId/syncTokens/newViewId}} - where {{newViewId}} is defined by whatever discovery mechanism is used
> * a special case is when only one instance remains. It can then not wait for any other survivor to send a sync token, so sync tokens would not work. All it could then possibly do is wait for a certain time (which should be larger than any expected background-read duration).
> [~mreutegg], [~chetanm] can you please confirm/comment on the above "flush/sync token" approach? Thx!
> /cc [~marett]
> \[0\] - see [getLeader() in ClusterView|https://github.com/apache/sling/blob/trunk/bundles/extensions/discovery/api/src/main/java/org/apache/sling/discovery/ClusterView.java]
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
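To make the write-and-wait protocol above concrete, here is a minimal, hypothetical sketch in Java. This is not the actual discovery implementation: a plain in-memory map stands in for the eventually consistent repository, and everything except the path layout and the {{slingId}}/{{newViewId}} placeholders is invented for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the "flush/sync token" idea: each survivor writes a
// token under /var/discovery/oak/clusterInstances/<slingId>/syncTokens/<newViewId>
// and only sends TOPOLOGY_CHANGED once the token of every other survivor is
// visible for the same view id. A plain map stands in for the repository.
public class SyncTokenSketch {

    // "repository" as the local background read would see it: path -> view id
    private final Map<String, String> repo = new ConcurrentHashMap<String, String>();

    private static String tokenPath(String slingId, String newViewId) {
        return "/var/discovery/oak/clusterInstances/" + slingId
                + "/syncTokens/" + newViewId;
    }

    public void writeSyncToken(String slingId, String newViewId) {
        repo.put(tokenPath(slingId, newViewId), newViewId);
    }

    // true once every survivor's token for newViewId has propagated
    public boolean allTokensSeen(Set<String> survivorIds, String newViewId) {
        for (String slingId : survivorIds) {
            if (!newViewId.equals(repo.get(tokenPath(slingId, newViewId)))) {
                return false; // this survivor's write is not visible yet
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SyncTokenSketch sketch = new SyncTokenSketch();
        Set<String> survivors =
                new HashSet<String>(Arrays.asList("instance-a", "instance-b"));
        sketch.writeSyncToken("instance-a", "view-2");
        System.out.println(sketch.allTokensSeen(survivors, "view-2")); // false: b pending
        sketch.writeSyncToken("instance-b", "view-2");
        System.out.println(sketch.allTokensSeen(survivors, "view-2")); // true
    }
}
```

In the real system the read side would go through the repository's background read, so {{allTokensSeen}} would be polled until it turns true (or a timeout fires for the single-survivor special case).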
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974314#comment-14974314 ] Stefan Egli commented on SLING-4627: is superseded by SLING-5131
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948756#comment-14948756 ] Stefan Egli commented on SLING-4627: this part is now handled in SLING-5131
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532182#comment-14532182 ] Stefan Egli commented on SLING-4627: [~marett], thanks for the review comments!
bq. If this understanding is correct, I'd have a few questions:
Yes, that's as I understand it too.
bq. 1.
Absolutely, it is problematic. The original idea was rule #6 - the {{minEventDelay}} - but that is probably too simplistic. I have actually taken your input to rethink rule #6 and I think it must be integrated into OAK-2844 - I've added a comment over there that suggests using oak's insight to delay sending the discovery-lite cluster-changed event (I have yet to flesh out all the details, but see the initial comment I added [there|https://issues.apache.org/jira/browse/OAK-2844?focusedCommentId=14531291&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14531291]).
bq. 2.
This should be covered by the definition of the sync token. Whenever the 'cluster view detection mechanism' declares there is a new cluster view (be it via voting or via atomic-updating-in-a-shared-resource), that view must carry a unique id - and that id can then be used as the sync token id. And yes, a particular instance could still be 'somewhat behind' and flag that it is changing to an older view (especially given eventual-consistency delays, as these sync tokens go through the repository) - *but* eventually it will also see the latest and greatest cluster view, and it will send a sync token for that too. And all the others are already waiting for that new sync token. It is all based on the fact that the discovery mechanism has a different delay than the repository - but with this coupling via the sync token, the difference can be handled.
(PS: the suggestion will be to use the {{ClusterView.getId()}} of OAK-2844 as the sync token.)
{quote}I don't know the details of Oak very well, but maybe there is a queue of data to be replicated somewhere. Getting a hand on this queue may offer such a guarantee that data has been replicated up to the point in time X. Assuming such a queue exists, each instance could write a piece of data at time X and wait until it sees it come out of the queue (or written in the journal). This would allow each instance to care only about itself.{quote}
That is a good point indeed! And as mentioned, it made me rethink rule #6, and I now believe OAK-2844 should be used to process oak's inside knowledge - as it indeed has the last known revision ids of all instances, so it just needs to [combine the dots|https://issues.apache.org/jira/browse/OAK-2844?focusedCommentId=14531291&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14531291]
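The 'older view' situation described in the answer to question 2 can be illustrated with a tiny hypothetical sketch (names invented; a map stands in for the repository): because the sync token carries the view id, a token a lagging instance wrote for an older view never satisfies the wait for the new one - only its eventual token for the new view does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: because sync tokens are keyed by view id, a token
// written by a lagging instance for an older view cannot be mistaken for
// agreement on the newest view.
public class StaleTokenSketch {

    // slingId -> view id named by the instance's most recent sync token
    private final Map<String, String> latestToken = new HashMap<String, String>();

    public void writeToken(String slingId, String viewId) {
        latestToken.put(slingId, viewId);
    }

    // an instance has acknowledged viewId only if its token names exactly that view
    public boolean acknowledged(String slingId, String viewId) {
        return viewId.equals(latestToken.get(slingId));
    }

    public static void main(String[] args) {
        StaleTokenSketch sketch = new StaleTokenSketch();
        sketch.writeToken("laggard", "view-1");                       // still on the old view
        System.out.println(sketch.acknowledged("laggard", "view-2")); // false: stale token ignored
        sketch.writeToken("laggard", "view-2");                       // eventually catches up
        System.out.println(sketch.acknowledged("laggard", "view-2")); // true
    }
}
```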
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532188#comment-14532188 ] Stefan Egli commented on SLING-4627: One additional comment: even when taking oak's knowledge of the last known revision and rollback-on-crash into account, the syncToken will be necessary: one instance cannot know when another instance has exactly processed the fact that the view is changing - so it cannot identify a point in time after which it is safe to continue. The only way is an explicit communication, a message, from each recipient, i.e. each instance. That message says: yes, I have taken note of the view changing - and flags that with an id. In other cases this message would not be necessary, as that is already achieved as part of the voting - *but* if the voting (to agree on a new view) is done *outside* of the repository, then it is not in sync with it - hence a sync token is then necessary. If the voting is done *inside* the repository, a sync token is not needed.
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532334#comment-14532334 ] Stefan Egli commented on SLING-4627: [~marett], I've added a suggestion for how the {{ConsistencyService}} could be [informed about a backlog by oak|https://issues.apache.org/jira/browse/OAK-2844?focusedCommentId=14532315&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14532315]. With that we should be able to handle both actually shut-down/crashed instances and partitioned instances (partition in the sense that the implementor of discovery.api declares the instance dead but oak does not).
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530846#comment-14530846 ] Timothee Maret commented on SLING-4627:
---
IIUC, this would mean that whenever the topology changes at a point in time X, each remaining instance in the topology would write a syncToken to the repository. Each instance would deem the data as replicated up to the point X once it sees most (or maybe all remaining) syncTokens. If this understanding is correct, I'd have a few questions:
1. How can this mechanism guarantee that the data written by the instance that crashed at point X is replicated, given that it can't write its own syncToken? Assuming this instance is the leader, this may be problematic.
2. How would the syncToken be reused in consecutive consistency checks? Couldn't it be possible that some instances see an old consistency token and consider it valid?
I don't know the details of Oak very well, but maybe there is a queue of data to be replicated somewhere. Getting a hand on this queue may offer such a guarantee that data has been replicated up to the point in time X. Assuming such a queue exists, each instance could write a piece of data at time X and wait until it sees it come out of the queue (or written in the journal). This would allow each instance to care only about itself.
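The queue idea in the last paragraph can be sketched as follows (hypothetical: an in-memory FIFO stands in for Oak's replication machinery, and the ordering guarantee is assumed, not established): once an instance sees its own marker, written at time X, come out of the queue, everything it wrote before X must have been processed as well.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical illustration of the "write a marker and wait for it to come out
// of the queue" idea. An in-memory FIFO queue stands in for the replication
// queue/journal; the key assumption is that it preserves write order, so once
// the marker written at time X has drained, all earlier writes have drained too.
public class MarkerSketch {

    private final Queue<String> replicationQueue = new ArrayDeque<String>();

    public void write(String data) {
        replicationQueue.add(data);
    }

    // drain and return the next replicated entry, or null when caught up
    public String replicateNext() {
        return replicationQueue.poll();
    }

    public static void main(String[] args) {
        MarkerSketch sketch = new MarkerSketch();
        sketch.write("change-1");
        sketch.write("change-2");
        sketch.write("marker-X"); // written at "time X"
        boolean markerSeen = false;
        String entry;
        while ((entry = sketch.replicateNext()) != null) {
            if ("marker-X".equals(entry)) {
                markerSeen = true; // all writes before time X are now replicated
            }
        }
        System.out.println(markerSeen); // true
    }
}
```

The appeal of this variant is exactly what the comment notes: each instance only watches for its own marker, so no cross-instance waiting is needed.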
These findings are independent of the implementation details used inside the discovery implementation, so apply to discovery.impl, discovery.etcd/.zookeeper/.oak etc. Tickets to implement this for specific implementation are best created separately (eg sub-task or related..). Also note that this assumes immediately sending TOPOLOGY_CHANGING as described [in SLING-3432|https://issues.apache.org/jira/browse/SLING-3432?focusedCommentId=14492494page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14492494] h5. The spectrum of possible TOPOLOGY_CHANGED events include the following scenarios: || scenario || classification || action || | A. change is completely outside of local cluster | (/) uncritical | changes outside the cluster are considered uncritical for this exercise. | | B. a new instance joins the local cluster, this new instance is by contract not the leader (leader must be stable \[0\]) | (/) uncritical | a join of an instance is uncritical due to the fact that it merely joins the cluster and has thus no 'backlog' of changes that might be propagating through the (eventually consistent) repository. | | C. a non-leader *leaves* the local cluster | (x) *critical* | changes that were written by the leaving instance might still not be *seen* by all surviving (ie it can be that discovery is faster than the repository) and this must be assured before sending out TOPOLOGY_CHANGED. This is because the leaving instance could have written changes that are *topology dependent* and thus those changes must first be settled in the repository before continuing with a *new topology*. | | D. the leader *leaves* the local cluster (and thus a new leader is elected) | (x)(x) *very critical* | same as C except that this is more critical due to the fact that the leader left | | E. 
-the leader of the local cluster changes (without leaving)- this is not supported by contract (leader must be stable \[0\]) | (/) -irrelevant- | | So both C and D are about an instance leaving. And as mentioned above the survivors must assure they have read all changes of the leavers. There are two parts to this: * the leaver could have pending writes that are not yet in mongoD: I don't think this is the case. The only thing that can remain could be an uncommitted branch and that would be rolled back afaik. ** Exception to this is a partition: where the leaver didn't actually crash but is still hooked to the repository. *For this I'm not sure how it can be solved* yet. * the survivers could however not yet have read all changes (pending in the background read) and one way to make sure they did is to have each surviving instance write a (pseudo-) sync token to the repository. Once all survivors have seen this sync token of all other survivors, the assumption is that all pending changes are
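The survivor-side handshake of the flush/sync-token proposal can be sketched roughly as follows. This is an illustrative in-memory model, not Sling code: the map stands in for the repository path {{/var/discovery/oak/clusterInstances/slingId/syncTokens/newViewId}}, and all class and method names here are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative model of the proposed sync-token handshake: each surviving
// instance writes a token keyed by (slingId, newViewId) and only proceeds
// (i.e. sends TOPOLOGY_CHANGED) once it has seen the token of every survivor.
public class SyncTokenModel {
    // stand-in for the repository's syncTokens subtree
    private final Map<String, String> tokenStore = new ConcurrentHashMap<>();

    public void writeToken(String slingId, String newViewId) {
        tokenStore.put(slingId, newViewId);
    }

    /** True once every survivor has published a token for the new view. */
    public boolean allSurvivorsSynced(Collection<String> survivorIds, String newViewId) {
        for (String slingId : survivorIds) {
            if (!newViewId.equals(tokenStore.get(slingId))) {
                return false; // this survivor has not caught up yet
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SyncTokenModel model = new SyncTokenModel();
        List<String> survivors = Arrays.asList("instance-a", "instance-b");
        model.writeToken("instance-a", "view-2");
        System.out.println(model.allSurvivorsSynced(survivors, "view-2")); // false: b missing
        model.writeToken("instance-b", "view-2");
        System.out.println(model.allSurvivorsSynced(survivors, "view-2")); // true
    }
}
```

Note how the single-survivor special case falls out of the model: with a survivor list of size one, the loop trivially succeeds, which is exactly why a timeout rather than a token wait is needed there.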
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513926#comment-14513926 ] Timothee Maret commented on SLING-4627:
---
bq. not sure 'Partition' is the best name
It was referring to mathematical set partitions https://en.wikipedia.org/wiki/Partition_of_a_set I agree it can be confusing in the context of the discovery service, where we have network partitions :-) Since a partition is essentially a set, I'd go with [~cziegeler]'s suggestion of renaming it to InstanceSet and keep the partition speak for the documentation.
Regarding location, I agree it would make more sense to move it to a commons bundle. The implementation can be reused against different implementations of the Discovery API.
The helper contained in the patch allows filtering the InstanceSet by specifying an unbounded list of filters. However, the logical binding between the filters is limited to AND (no possibility to use OR or NOT). With Java 8 we could represent those filters as Predicate \[0\], which would allow composing filters with and, or and not. But I believe Java 8 is not yet the minimum supported platform for Sling, so we could either:
1. Live with the limitation (stick to AND); after all, it is still possible to express an OR or NOT by wrapping InstanceFilters; or
2. Implement an ad-hoc Predicate for the InstanceSet
wdyt? In any case, I have opened SLING-4665 to track adding this helper class. It may require creating a jira component for sling.commons.
\[0\] https://docs.oracle.com/javase/8/docs/api/java/util/function/Predicate.html
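Option 1 above (sticking to AND) still admits OR and NOT by wrapping filters, as noted. A minimal sketch of that wrapping, using pre-Java-8 anonymous classes; {{InstanceFilter}} and {{InstanceDescription}} are reduced stand-ins here so the sketch compiles on its own (the real interfaces live in org.apache.sling.discovery):

```java
public class FilterComposition {

    // Stand-ins for the discovery.api types, reduced to what the sketch needs.
    interface InstanceDescription {
        boolean isLeader();
    }

    interface InstanceFilter {
        boolean accept(InstanceDescription instance);
    }

    // An OR of two filters, expressed as a single wrapping InstanceFilter
    static InstanceFilter or(final InstanceFilter a, final InstanceFilter b) {
        return new InstanceFilter() {
            public boolean accept(InstanceDescription instance) {
                return a.accept(instance) || b.accept(instance);
            }
        };
    }

    // A NOT, likewise expressed by wrapping
    static InstanceFilter not(final InstanceFilter f) {
        return new InstanceFilter() {
            public boolean accept(InstanceDescription instance) {
                return !f.accept(instance);
            }
        };
    }

    public static void main(String[] args) {
        InstanceFilter leader = new InstanceFilter() {
            public boolean accept(InstanceDescription instance) {
                return instance.isLeader();
            }
        };
        InstanceDescription nonLeader = new InstanceDescription() {
            public boolean isLeader() { return false; }
        };
        System.out.println(not(leader).accept(nonLeader));            // true
        System.out.println(or(leader, not(leader)).accept(nonLeader)); // true
    }
}
```

The AND-only binding in the helper thus does not limit expressiveness, only convenience; the wrapped filters can be passed to the helper like any other filter.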
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513793#comment-14513793 ] Stefan Egli commented on SLING-4627:
---
Looks interesting! I haven't looked at it in detail, just a few comments:
* not sure 'Partition' is the best name there, or whether it should be called just 'Part' or 'Sub[topology]' or something - Partition to me has a special connotation due to SLING-3432 :)
* regarding where to put this: just thinking whether this could/should go into discovery.support or a new discovery.commons rather than discovery.api (as this would not necessarily have to become part of the API). wdyt?
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513825#comment-14513825 ] Carsten Ziegeler commented on SLING-4627:
---
discovery.commons sounds good to me (discovery.support has a different intention). For the naming, I think it's basically a collection of instances, and with any filter this collection could be anything. So I would call it InstanceCollection or InstanceSet or something along those lines.
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510943#comment-14510943 ] Stefan Egli commented on SLING-4627:
---
I think the core of your suggestion is to add some help (by the API) to simplify the work of a listener when it has to find out whether an instance has joined or left the local cluster. That could indeed make a listener's life easier. However, I don't think we necessarily have to change the API for this: all the information is already encapsulated in the TopologyEvent: it knows whether it is a CHANGED event (only then is this help needed), it has the old and the new view, and it can provide additional helper methods easily - or via an adjacent TopologyEventHelper, for example. So I think you could add this functionality to discovery.api without any need for change in any discovery.*impl, and even remain backwards compatible, by adding eg the following methods to either TopologyEvent or a new TopologyEventHelper (in the latter case the methods would of course look slightly different):
{code}
// added to either TopologyEvent or, slightly differently, to a new TopologyEventHelper

/**
 * @return the list of {@code InstanceDescription} for each instance that
 *         has been added.
 */
List<InstanceDescription> getAddedInstances() {
    //TODO
}

/**
 * @return the list of {@code InstanceDescription} for each instance that
 *         has been removed.
 */
List<InstanceDescription> getRemovedInstances() {
    //TODO
}
{code}
wdyt?
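Without such helper methods, every listener has to compute the diff itself from the old and new views. The underlying diff is just two set differences over sling ids; a sketch of that, with plain string ids standing in for {{InstanceDescription}} lookups (simplified for illustration, not discovery.api code):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// The diff each listener currently has to reimplement: instances present in
// the new view but not the old are "added", the converse are "removed".
public class ViewDiff {

    static Set<String> added(Set<String> oldIds, Set<String> newIds) {
        Set<String> result = new HashSet<>(newIds);
        result.removeAll(oldIds); // keep only ids that are new
        return result;
    }

    static Set<String> removed(Set<String> oldIds, Set<String> newIds) {
        Set<String> result = new HashSet<>(oldIds);
        result.removeAll(newIds); // keep only ids that disappeared
        return result;
    }

    public static void main(String[] args) {
        Set<String> oldView = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> newView = new HashSet<>(Arrays.asList("b", "c", "d"));
        System.out.println(added(oldView, newView));   // [d]
        System.out.println(removed(oldView, newView)); // [a]
    }
}
```

Centralizing exactly this logic behind getAddedInstances()/getRemovedInstances() is what would spare each listener from repeating it.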
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512206#comment-14512206 ] Timothee Maret commented on SLING-4627:
---
Yes, I believe we could provide better tooling for figuring out the diff in topology views. I have added a patch (SLING-4627.patch) which contains a potential version of it. It is some sort of builder that takes either a TopologyEvent or a pair of TopologyViews and allows to first partition the changes (added, removed, retained) and then apply further filtering on the partition (the most likely filters are available by default; custom filtering leverages the InstanceFilter interface). The patch has no tests and is likely buggy (I haven't tried it..), but it is good enough to show the approach. Assuming we add this, I believe this issue could be solved without an SPI, leaving each topology listener to figure out what to do with topology changes. For instance they could do
{code}
TopologyEvent event = ...;

// Leader instances that have been removed and are in the local cluster
ClusterView localView = event.getOldView().getLocalInstance().getClusterView();
Set<String> slingIds = new TopologyViewChange(event)
        .removed()
        .isLeader()
        .isInClusterView(localView)
        .get();
if (slingIds.size() > 0) {
    // do something ...
}

// Non leader, non local instances, from any cluster view, that match some
// custom properties and have been added
TopologyView v1 = ...;
TopologyView v2 = ...;
InstanceFilter customFilter = new InstanceFilter() {
    public boolean accept(InstanceDescription instance) {
        return instance.getProperties().containsKey("my-property");
    }
};
Set<String> slingIds = new TopologyViewChange(v1, v2)
        .added()
        .isNotLeader()
        .isNotLocal()
        .filterWith(customFilter)
        .get();
if (slingIds.size() > 0) {
    // do something ...
}

// Non leader instances that have been retained in a specific cluster view
ClusterView specificView = ...;
Set<String> slingIds = new TopologyViewChange(v1, v2)
        .retained(true)
        .isInClusterView(specificView)
        .get();
if (slingIds.size() > 0) {
    // do something ...
}

// Leader instances whose properties have changed, in any cluster view
Set<String> slingIds = new TopologyViewChange(v1, v2)
        .retained(true, true)
        .get();
if (slingIds.size() > 0) {
    // do something ...
}
{code}
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509036#comment-14509036 ] Timothee Maret commented on SLING-4627:
---
Thinking about this again. In order to support the cases described in this issue, what has to be decided is whether or not an instance has left the cluster the local instance is a member of. If this holds, then some wait for cluster replication has to be enforced. It has been pointed out that the work of deciding this case would need to be replicated to all listeners (who depend on cluster replication). However, I think we may avoid this work by extending the list of events sent by the discovery service. Instead of sending only the TOPOLOGY_CHANGED event, how about extending the API with a new pair of Event/EventListener? Those would be like
{code}
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */
package org.apache.sling.discovery;

import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * An instance event is sent whenever a change in the topology occurs.
 *
 * This event object might be extended in the future with new event types and
 * methods.
 *
 * @see InstanceEventListener
 */
public class InstanceEvent {

    public static enum Type {
        /**
         * Informs the service about one or more instances added to the topology.
         */
        ADDED,
        /**
         * Informs the service about one or more instances removed from the topology.
         */
        REMOVED,
        /**
         * One or many properties have been changed on the instance.
         */
        PROPERTIES_CHANGED
    }

    private final Type type;

    private final List<InstanceDescription> instanceDescriptions;

    public InstanceEvent(final Type type, final InstanceDescription... instanceDescription) {
        if (type == null) {
            throw new IllegalArgumentException("type must not be null");
        }
        if (instanceDescription.length == 0) {
            throw new IllegalArgumentException("One or more instance description must be provided");
        }
        this.type = type;
        this.instanceDescriptions = Collections.unmodifiableList(
                Arrays.asList(instanceDescription));
    }

    /**
     * Returns the type of this event
     *
     * @return the type of this event
     */
    public Type getType() {
        return type;
    }

    /**
     * Returns the instances affected by this event.
     *
     * @return the list of affected instance descriptions
     */
    public List<InstanceDescription> getInstances() {
        return instanceDescriptions;
    }

    @Override
    public String toString() {
        return "InstanceEvent{" +
                "type=" + type +
                ", instanceDescriptions=" + instanceDescriptions +
                '}';
    }
}
{code}
Using a dedicated handler would allow to only dispatch the events to the pieces of code that may require them (such as the Sling JobManager) in order to take action.
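The proposal above shows only the event half of the Event/EventListener pair. The listener half might look like the following sketch; this is hypothetical (it mirrors how TopologyEventListener is consumed, but none of these names are in the patch), and a minimal event stand-in is included so it runs on its own:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the listener side of the proposed pair. A reduced InstanceEvent
// stand-in replaces the full class from the comment above.
public class InstanceListenerSketch {

    enum Type { ADDED, REMOVED, PROPERTIES_CHANGED }

    static class InstanceEvent {
        final Type type;
        final List<String> slingIds;
        InstanceEvent(Type type, List<String> slingIds) {
            this.type = type;
            this.slingIds = slingIds;
        }
    }

    // Hypothetical counterpart to TopologyEventListener: implementations
    // registered as (OSGi) services would receive instance-level events.
    interface InstanceEventListener {
        void handleInstanceEvent(InstanceEvent event);
    }

    public static void main(String[] args) {
        final List<String> removedLog = new ArrayList<>();
        // A consumer interested only in removals (e.g. a JobManager-like service)
        InstanceEventListener listener = new InstanceEventListener() {
            public void handleInstanceEvent(InstanceEvent event) {
                if (event.type == Type.REMOVED) {
                    removedLog.addAll(event.slingIds);
                }
            }
        };
        listener.handleInstanceEvent(new InstanceEvent(Type.ADDED, Arrays.asList("a")));
        listener.handleInstanceEvent(new InstanceEvent(Type.REMOVED, Arrays.asList("b")));
        System.out.println(removedLog); // [b]
    }
}
```

The dispatch-to-interested-parties point from the comment shows up here: the ADDED event is delivered but simply ignored by a listener that only cares about removals.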
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502762#comment-14502762 ]

Stefan Egli commented on SLING-4627:

I tend to agree that we're overloading discovery.impl by adding such a sync mechanism to it - while the argument could indeed be, as [~marett] pointed out, that this is something for the 'application level', eg the JobManager etc. [~cziegeler], wdyt? Should this perhaps become a utility of some sort, outside of discovery.impl?

> TOPOLOGY_CHANGED in an eventually consistent repository
> ---
> Key: SLING-4627
> URL: https://issues.apache.org/jira/browse/SLING-4627
> Project: Sling
> Issue Type: Improvement
> Components: Extensions
> Reporter: Stefan Egli
> Priority: Critical
>
> This is a parent ticket describing the +coordination effort needed for properly sending TOPOLOGY_CHANGED when running on top of an eventually consistent repository+. These findings are independent of the implementation details used inside the discovery implementation, so they apply to discovery.impl, discovery.etcd/.zookeeper/.oak etc. Tickets to implement this for specific implementations are best created separately (eg as sub-tasks or related issues). Also note that this assumes immediately sending TOPOLOGY_CHANGING as described [in SLING-3432|https://issues.apache.org/jira/browse/SLING-3432?focusedCommentId=14492494&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14492494]
> h5. The spectrum of possible TOPOLOGY_CHANGED events includes the following scenarios:
> || scenario || classification || action ||
> | A. change is completely outside of the local cluster | (/) uncritical | changes outside the cluster are considered uncritical for this exercise. |
> | B. a new instance joins the local cluster; this new instance is by contract not the leader (the leader must be stable \[0\]) | (/) uncritical | a join of an instance is uncritical because it merely joins the cluster and thus has no 'backlog' of changes that might still be propagating through the (eventually consistent) repository. |
> | C. a non-leader *leaves* the local cluster | (x) *critical* | changes written by the leaving instance might not yet be *seen* by all survivors (ie discovery can be faster than the repository), and this must be ensured before sending out TOPOLOGY_CHANGED. The leaving instance could have written changes that are *topology dependent*, so those changes must first settle in the repository before continuing with a *new topology*. |
> | D. the leader *leaves* the local cluster (and thus a new leader is elected) | (x)(x) *very critical* | same as C, except more critical due to the fact that it was the leader that left |
> | E. -the leader of the local cluster changes (without leaving)- this is not supported by contract (the leader must be stable \[0\]) | (/) -irrelevant- | |
> So both C and D are about an instance leaving. And as mentioned above, the survivors must ensure they have read all changes of the leavers. There are two parts to this:
> * the leaver could have pending writes that are not yet in MongoDB: I don't think this is the case. The only thing that can remain is an uncommitted branch, and that would be rolled back afaik.
> ** An exception to this is a partition: the leaver didn't actually crash but is still hooked to the repository. *For this I'm not sure yet how it can be solved*.
> * the survivors, however, might not yet have read all changes (pending in the background read). One way to make sure they did is to have each surviving instance write a (pseudo-) sync token to the repository. Once all survivors have seen this sync token of all other survivors, the assumption is that all pending changes have flushed through the eventually consistent repository and that it is safe to send out a TOPOLOGY_CHANGED event.
> * this sync token must be *conflict free* and could be eg: {{/var/discovery/oak/clusterInstances/slingId/syncTokens/newViewId}} - where {{newViewId}} is defined by whatever discovery mechanism is used
> * a special case is when only one instance remains. It then cannot wait for any other survivor to send a sync token, so sync tokens would not work. All it could then do is wait for a certain time (which should be larger than any expected background-read duration)
> [~mreutegg], [~chetanm] can you pls confirm/comment on the above flush/sync token approach? Thx! /cc [~marett]
> \[0\] - see [getLeader() in ClusterView|https://github.com/apache/sling/blob/trunk/bundles/extensions/discovery/api/src/main/java/org/apache/sling/discovery/ClusterView.java]

-- This message was sent by Atlassian JIRA
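The sync-token handshake described in the ticket can be sketched as a minimal simulation. Note this is illustrative only: none of the class or method names below are actual Sling API, and a plain in-memory set stands in for the (eventually consistent) repository.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the sync-token handshake: each survivor writes a
// token under its own, conflict-free path for the new view id, then waits
// until it has seen the tokens of all other survivors.
public class SyncTokenSketch {

    // Stands in for the repository: the set of token paths currently visible.
    private final Set<String> repository = new HashSet<>();

    /** A survivor writes its own sync token for the new view. */
    public void writeSyncToken(String slingId, String newViewId) {
        repository.add(tokenPath(slingId, newViewId));
    }

    /** True once every survivor's token for newViewId is visible - only then
     *  would it be safe to send TOPOLOGY_CHANGED. */
    public boolean allTokensSeen(String newViewId, String... survivors) {
        for (String slingId : survivors) {
            if (!repository.contains(tokenPath(slingId, newViewId))) {
                return false;
            }
        }
        return true;
    }

    // Path layout follows the example in the ticket; conflict-free because
    // each instance only ever writes below its own slingId.
    private static String tokenPath(String slingId, String newViewId) {
        return "/var/discovery/oak/clusterInstances/" + slingId
                + "/syncTokens/" + newViewId;
    }
}
```

In a real, eventually consistent repository the `allTokensSeen` check would be polled, since another survivor's token becomes visible only after the local background read has caught up.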
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502789#comment-14502789 ]

Carsten Ziegeler commented on SLING-4627:

I think the application itself can't handle this - it does not have enough knowledge of the implementation details to do so (which is good). This also means the application does not know whether the data store has strong consistency or is eventually consistent. Let's assume for a second we don't solve it within the discovery impl. What would the app code do, exactly? It will receive a topology event, but does not really know what changed - so the app needs to do a diff to find out which of the above cases it is. If it's none of the critical cases, fine - if it is a critical case, the app calls a wait-for-sync utility. Where is this defined? What does the api look like? I have the feeling that we can't abstract this in a nice way, and then every app using a topology listener needs to do this stuff. So what about this: the discovery impl can be configured to use such a sync service. If it's configured to do so, it looks it up and uses it, which means the discovery impl waits with sending out the changed event until the sync service has notified it. With this, we just create such an SPI interface in discovery - but leave the implementation to someone else.
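The SPI suggested here could take roughly the following shape. This is a speculative sketch only - the interface and method names are hypothetical, not the final Sling API (per a later comment, the eventual service was named ClusterSyncService):

```java
// Hypothetical SPI sketch: if the discovery impl is configured with such a
// service, it looks it up and defers sending TOPOLOGY_CHANGED until the
// service invokes the callback. The implementation (sync tokens, timeouts,
// repository-specific logic) is left to whoever provides the service.
public interface ClusterSyncServiceSketch {

    /**
     * Asks the sync service to ensure the repository has settled for the
     * given new view; whenSettled is invoked once it is safe for the
     * discovery implementation to send TOPOLOGY_CHANGED.
     */
    void sync(String newViewId, Runnable whenSettled);
}
```

A trivial pass-through implementation (for a strongly consistent repository) would simply run the callback immediately; an eventually consistent one would first complete the sync-token handshake.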
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502859#comment-14502859 ]

Stefan Egli commented on SLING-4627:

+1 for such an SPI - it could be reused by all implementations of discovery
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503155#comment-14503155 ]

Timothee Maret commented on SLING-4627:

I think the implementation of a mechanism covering the cases mentioned above (critical or not) must differentiate between the repository implementations it runs on (on some cluster technology it waits longer, on others there may be a mechanism to be notified upon replication, etc.), and it could be implemented on top of the Sling discovery API (by keeping track of who the leader is and diffing the properties). So +1 if the SPI can lead to an implementation that takes the repository into account and that can be reused among Sling discovery implementations.
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496362#comment-14496362 ]

Timothee Maret commented on SLING-4627:

IMO using a 3rd party implementation for the discovery implementation (such as ZooKeeper or etcd) can be achieved without storing data in the repository powering the Sling instance. zk or etcd provide primitives to guarantee that at any time only one leader exists in the topology. With this in mind, whether the repository is eventually consistent or not should be irrelevant. The TOPOLOGY_CHANGED event should IMO be sent whenever the topology has changed, and not be used as a synchronization mechanism for the underlying repository.
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496410#comment-14496410 ]

Stefan Egli commented on SLING-4627:

So some topology listeners will require such a 'sync mechanism' after a TOPOLOGY_CHANGED - I guess whether this is part of discovery or not is indeed an architectural question. The advantage of having it inside discovery is that the listeners do not need to take care of it. The disadvantage is that all discovery implementations must implement it, and that you can't react faster as a topology listener even if you wanted to.
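If the sync stayed outside of discovery, each topology listener would first have to diff the old and new views to find out which of the ticket's cases A-E it is facing. A minimal sketch of that classification (all names hypothetical; real Sling listeners would diff TopologyView/ClusterView objects, not string lists):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of the per-listener diff: compare the old and new
// local-cluster membership and map the change onto the ticket's scenarios.
// Case E (leader change without leaving) is excluded by contract.
public class TopologyDiffSketch {

    public enum Case { OUTSIDE_CLUSTER, JOIN, NON_LEADER_LEFT, LEADER_LEFT }

    public static Case classify(String oldLeader, String[] oldView, String[] newView) {
        Set<String> left = new HashSet<>(Arrays.asList(oldView));
        left.removeAll(Arrays.asList(newView));
        if (left.isEmpty()) {
            // nothing left the local cluster: either a pure join (case B)
            // or a change entirely outside the local cluster (case A)
            return newView.length > oldView.length ? Case.JOIN : Case.OUTSIDE_CLUSTER;
        }
        // cases C and D: the critical ones that need a wait-for-sync
        // before acting on the new topology
        return left.contains(oldLeader) ? Case.LEADER_LEFT : Case.NON_LEADER_LEFT;
    }
}
```

Having every listener repeat this diff-and-wait logic is exactly the duplication the SPI proposal avoids by handling it once inside discovery.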
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496496#comment-14496496 ] Stefan Egli commented on SLING-4627: PS: Just for the record: when running discovery.impl with a *large-enough heartbeat timeout*, this issue is much less likely. a) here such a 'sync-token' is automatically sent in the form of the promotion of the {{/establishedViews}} and b) due to the fact that the heartbeat timeout is (this time) set larger than any expected delay. When thus switching from discovery.impl to eg discovery.etcd this problem is newly exposed. TOPOLOGY_CHANGED in an eventually consistent repository --- Key: SLING-4627 URL: https://issues.apache.org/jira/browse/SLING-4627 Project: Sling Issue Type: Improvement Components: Extensions Reporter: Stefan Egli Priority: Critical This is a parent ticket describing the +coordination effort needed between properly sending TOPOLOGY_CHANGED when running ontop of an eventually consistent repository+. These findings are independent of the implementation details used inside the discovery implementation, so apply to discovery.impl, discovery.etcd/.zookeeper/.oak etc. Tickets to implement this for specific implementation are best created separately (eg sub-task or related..). Also note that this assumes immediately sending TOPOLOGY_CHANGING as described [in SLING-3432|https://issues.apache.org/jira/browse/SLING-3432?focusedCommentId=14492494page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14492494] h5. The spectrum of possible TOPOLOGY_CHANGED events include the following scenarios: || scenario || classification || action || | A. change is completely outside of local cluster | (/) uncritical | changes outside the cluster are considered uncritical for this exercise. | | B. 
a new instance joins the local cluster, this new instance is by contract not the leader (leader must be stable \[0\]) | (/) uncritical | a join of an instance is uncritical due to the fact that it merely joins the cluster and has thus no 'backlog' of changes that might be propagating through the (eventually consistent) repository. | | C. a non-leader *leaves* the local cluster | (x) *critical* | changes that were written by the leaving instance might still not be *seen* by all surviving (ie it can be that discovery is faster than the repository) and this must be assured before sending out TOPOLOGY_CHANGED. This is because the leaving instance could have written changes that are *topology dependent* and thus those changes must first be settled in the repository before continuing with a *new topology*. | | D. the leader *leaves* the local cluster (and thus a new leader is elected) | (x)(x) *very critical* | same as C except that this is more critical due to the fact that the leader left | | E. -the leader of the local cluster changes (without leaving)- this is not supported by contract (leader must be stable \[0\]) | (/) -irrelevant- | | So both C and D are about an instance leaving. And as mentioned above the survivors must assure they have read all changes of the leavers. There are two parts to this: * the leaver could have pending writes that are not yet in mongoD: I don't think this is the case. The only thing that can remain could be an uncommitted branch and that would be rolled back afaik. ** Exception to this is a partition: where the leaver didn't actually crash but is still hooked to the repository. *For this I'm not sure how it can be solved* yet. * the survivers could however not yet have read all changes (pending in the background read) and one way to make sure they did is to have each surviving instance write a (pseudo-) sync token to the repository. 
Once all survivors have seen this sync token from all other survivors, the assumption is that all pending changes have been flushed through the eventually consistent repository and it is safe to send out a TOPOLOGY_CHANGED event.
* this sync token must be *conflict free* and could be e.g. {{/var/discovery/oak/clusterInstances/slingId/syncTokens/newViewId}} - where {{newViewId}} is defined by whatever discovery mechanism is used
* a special case is when only one instance remains. It cannot then wait for any other survivor to send a sync token, so sync tokens would not work. All it could then do is wait for a certain time (which should be larger than any expected background-read duration)
[~mreutegg], [~chetanm] can you please confirm/comment on the above flush/sync-token approach? Thx! /cc [~marett]
\[0\] - see [getLeader() in ClusterView|https://github.com/apache/sling/blob/trunk/bundles/extensions/discovery/api/src/main/java/org/apache/sling/discovery/ClusterView.java]
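The flush/sync-token handshake described above can be sketched roughly as follows. This is an illustrative in-memory simulation, not actual discovery code: the class and method names are hypothetical, and a {{ConcurrentHashMap}} stands in for the (eventually consistent) repository path under {{syncTokens}}.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the sync-token handshake: each surviving instance writes a
// conflict-free token keyed by (slingId, newViewId); TOPOLOGY_CHANGED may
// only be sent once the tokens of *all* survivors are visible locally.
public class SyncTokenSketch {

    // Repository stand-in: slingId -> set of newViewIds it wrote a token for.
    private final Map<String, Set<String>> repo = new ConcurrentHashMap<>();

    /** A survivor writes its token under a conflict-free, per-instance path. */
    public void writeSyncToken(String slingId, String newViewId) {
        repo.computeIfAbsent(slingId, k -> ConcurrentHashMap.newKeySet())
            .add(newViewId);
    }

    /**
     * True once every survivor's token for the new view is visible, i.e. all
     * pending changes are assumed flushed and TOPOLOGY_CHANGED can be sent.
     * Special case not modelled here: a lone survivor has nobody to wait for
     * and can only fall back to waiting longer than the expected
     * background-read delay.
     */
    public boolean allSurvivorsSynced(Set<String> survivors, String newViewId) {
        return survivors.stream()
            .allMatch(id -> repo.getOrDefault(id, Set.of()).contains(newViewId));
    }
}
```

In a real implementation the write and the visibility check would go through the repository (and thus be subject to the same background read/write delays the token is meant to flush out); the map merely shows the handshake logic.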
[jira] [Commented] (SLING-4627) TOPOLOGY_CHANGED in an eventually consistent repository
[ https://issues.apache.org/jira/browse/SLING-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496429#comment-14496429 ] Timothee Maret commented on SLING-4627: --- +1 IMO the application (job manager or else) must be developed taking into account the characteristics of the underlying storage. Leaking the job manager's requirements into the topology implementation may be the first adjustment of a long series where each application comes with its own requirements.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)