[ 
https://issues.apache.org/jira/browse/SLING-5435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15123600#comment-15123600
 ] 

Timothee Maret edited comment on SLING-5435 at 1/29/16 3:38 PM:
----------------------------------------------------------------

One example from the list of use cases I shared above.

{code}
/*
 * Pattern: Only one instance (the topology leader) does the actual import.
 */
class SomeImporter implements Importer, TopologyEventListener {

    private volatile boolean leader;

    public void handleTopologyEvent(final TopologyEvent event) {
        if (event.getType() == TopologyEvent.Type.TOPOLOGY_CHANGED
                || event.getType() == TopologyEvent.Type.TOPOLOGY_INIT) {
            this.leader = event.getNewView().getLocalInstance().isLeader();
        }
    }

    /**
     * Invoked by the framework.
     */
    public void importData() {
        if (leader) {
            // actually import data from a data source into the repository
        } else {
            // do nothing, as this instance is not the leader
        }
    }
}
{code}

The import method should not need to wait on the repository replication before 
running, in order to avoid the following sequence:

{code}
# note: the import method is invoked every day by the framework
1. The topology leader leaves the topology (as a result of a crash, for instance)
2. The topology elects a new leader
3. The framework invokes the import method
4. The import method runs the "else" branch on every instance, even though a leader exists
5. The discovery implementation decides to send the TOPOLOGY_CHANGED events (it thinks the changes from the previous leader are visible on every other instance)
6. Every SomeImporter on every instance knows who the new leader is; they are ready for tomorrow's import invocation
# result: the import method did not execute the "if" (leader) branch, even though it had everything in place to do so
{code}
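The sequence above can be reproduced as a small, self-contained simulation. The types below are stand-ins, not the real org.apache.sling.discovery API: the class name, the boolean-taking handleTopologyEvent, and importData are all hypothetical, reduced to the two calls that matter for the race.

```java
// Sketch of the stale-leader race: the cached flag is only updated when
// the (delayed) TOPOLOGY_CHANGED event finally arrives.
public class SomeImporterSketch {

    // cached leadership flag, updated only on TOPOLOGY_INIT/TOPOLOGY_CHANGED
    private volatile boolean leader;

    // stands in for handleTopologyEvent(TopologyEvent); the discovery
    // implementation withholds this call until replication has settled
    public void handleTopologyEvent(boolean isLeaderInNewView) {
        this.leader = isLeaderInNewView;
    }

    // stands in for the daily framework-invoked import;
    // returns whether this instance actually imported
    public boolean importData() {
        return leader; // false == the "else" (do nothing) branch
    }

    public static void main(String[] args) {
        SomeImporterSketch survivor = new SomeImporterSketch();
        SomeImporterSketch elected = new SomeImporterSketch();

        // steps 1-4: old leader crashed, a new leader was elected, but
        // TOPOLOGY_CHANGED has not been delivered yet, so the daily import
        // runs the "else" branch everywhere
        boolean anyImport = survivor.importData() || elected.importData();
        System.out.println("import ran during the window: " + anyImport); // false

        // steps 5-6: events delivered; ready for tomorrow's invocation
        survivor.handleTopologyEvent(false);
        elected.handleTopologyEvent(true);
        System.out.println("new leader imports next time: " + elected.importData()); // true
    }
}
```

The simulation shows the day's import is lost on every instance, even though a leader existed the whole time, which is exactly why the event delivery should not be coupled to replication.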



> Decouple processes that depend on cluster leader elections from the cluster 
> leader elections.
> ---------------------------------------------------------------------------------------------
>
>                 Key: SLING-5435
>                 URL: https://issues.apache.org/jira/browse/SLING-5435
>             Project: Sling
>          Issue Type: Improvement
>          Components: General
>            Reporter: Ian Boston
>
> Currently there are many processes in Sling that must complete before a Sling 
> Discovery cluster leader election is declared complete. These processes 
> include things like transferring all Jobs from the old leader to the new 
> leader and waiting for the data to become visible on the new leader. This 
> introduces additional overhead to the leader election process, which forces a 
> higher than desirable timeout for elections and heartbeats. This higher than 
> desirable timeout precludes the use of more efficient election and 
> distributed consensus algorithms such as those implemented in etcd, 
> ZooKeeper, or implementations of Raft.
> If the election could be declared complete while leaving individual 
> components to manage their own post-election operations (i.e., decoupling 
> those processes from the election), then faster elections or alternative 
> Discovery implementations such as the one built on etcd could be used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
