[jira] [Created] (SLING-2939) 3rd-party based implementation of discovery.api

Stefan Egli (JIRA) Mon, 01 Jul 2013 06:13:50 -0700

Stefan Egli created SLING-2939:
----------------------------------

             Summary: 3rd-party based implementation of discovery.api
                 Key: SLING-2939
                 URL: https://issues.apache.org/jira/browse/SLING-2939
             Project: Sling
          Issue Type: Task
          Components: Extensions
    Affects Versions: Discovery API 1.0.0
            Reporter: Stefan Egli
            Assignee: Stefan Egli



The Sling Discovery API introduces the abstraction of a topology which contains 
(Sling) clusters and instances, supports liveliness-detection, leader-election 
within a cluster and property-propagation between the instances. As a default 
and reference implementation a resource-based, OOTB implementation was created 
(org.apache.sling.discovery.impl).
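
As a rough illustration of the abstraction described above, the following is a minimal sketch of a topology made up of clusters and instances, with a leader per cluster and per-instance properties. All class and method names here (TopologySketch, Cluster, Instance, leader, allInstances) are hypothetical and chosen for illustration only; they are not the actual discovery.api types. The sketch also assumes that the first instance in a cluster's list is its leader.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the topology abstraction; NOT the real discovery.api types. */
final class TopologySketch {

    /** One Sling instance together with the properties it announces to the topology. */
    static final class Instance {
        final String id;
        final Map<String, String> properties = new HashMap<>();

        Instance(String id) {
            this.id = id;
        }
    }

    /** A cluster and its instances; the first instance is assumed to be the leader. */
    static final class Cluster {
        final String id;
        final List<Instance> instances = new ArrayList<>();

        Cluster(String id) {
            this.id = id;
        }

        Instance leader() {
            return instances.isEmpty() ? null : instances.get(0);
        }
    }

    final List<Cluster> clusters = new ArrayList<>();

    /** All instances of all clusters, in cluster order. */
    List<Instance> allInstances() {
        List<Instance> all = new ArrayList<>();
        for (Cluster cluster : clusters) {
            all.addAll(cluster.instances);
        }
        return all;
    }

    public static void main(String[] args) {
        TopologySketch topology = new TopologySketch();

        Cluster hub = new Cluster("hub");
        Instance first = new Instance("hub-1");
        first.properties.put("role", "author"); // illustrative property only
        hub.instances.add(first);
        hub.instances.add(new Instance("hub-2"));

        Cluster satellite = new Cluster("satellite");
        satellite.instances.add(new Instance("sat-1"));

        topology.clusters.add(hub);
        topology.clusters.add(satellite);

        System.out.println(hub.leader().id);                // hub-1
        System.out.println(topology.allInstances().size()); // 3
    }
}
```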

Pros and cons of the discovery.impl

Although the discovery.impl supports everything required in discovery.api, it 
has a few limitations. Here's a list of pros and cons:
Pros

    No additional software required (leverages repository for intra-cluster 
communication/storage and HTTP-REST calls for cross-cluster communication)
    Very small footprint
    Perfectly suited for single instances, single clusters and for small, 
rather stable hub-based topologies

Cons

    Config-/deployment-limitations (aka embedded-limitation): connections 
between clusters are peer-to-peer and explicit. To span a topology, a number of 
instances must know (or be made known to) each other, and changes in the 
topology typically require config adjustments to guarantee high availability of 
the discovery service
        Except if a natural "hub cluster" exists that can serve as connection 
point for all "satellite clusters"
        Other than that, it is less suited for large and/or dynamic topologies
    Change propagation (for topology parts reported via connectors) is 
non-atomic and slow, since it happens hop by hop
    No guarantee on the order of TopologyEvents sent in individual instances, 
i.e. different instances might see changes in the topology in a different 
order, but eventually the topology is guaranteed to be consistent
    Robustness of discovery.impl wrt storm situations depends on robustness of 
underlying cluster (not a real negative but discovery.impl might in theory 
unveil repository bugs which would otherwise not have been a problem)
    Rather new, little tested code which might have issues with edge cases wrt 
network problems
        although partitioning-support is not a requirement per se, similar 
edge-cases might exist wrt network-delays/timing/crashes

Reusing a suitable 3rd party library

To provide an additional implementation option for the discovery.api, one idea 
is to use a suitable 3rd party library.


Requirements

The following is a list of requirements a 3rd party library must support:

    liveliness detection: detect whether an instance is up and running
    stable leader election within a cluster: "stable" means that a leader 
remains leader until it leaves or crashes; no new, joining instance shall take 
over while a leader exists
    stable instance ordering: the list of instances within a cluster is ordered 
and stable; new, joining instances are put at the end of the list
    property propagation: propagate the properties provided within one instance 
to everybody in the topology. There are no timing requirements bound to this; 
the intention is not to use it for messaging but to announce config parameters 
to the topology
    support large, dynamic clusters: configuration of the new discovery 
implementation should be easy and support frequent changes in the (large) 
topology
    no single point of failure: this is obvious, there should of course be no 
single point of failure in the setup
    embedded or dedicated: this might be a hot topic. Embedding a library has 
the advantage of not having to install anything additional. A dedicated 
service, on the other hand, requires additional handling in deployment. 
Embedding implies a peer-to-peer setup: nodes communicate peer-to-peer rather 
than via a centralized service. This IMHO is a negative for large topologies, 
which would typically span data centers. Hence a dedicated service could be 
seen as an advantage in the end.
    due to the need for cross data-center deployments, the transport protocol 
must be TCP (or HTTP for that matter)
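
The "stable leader election" and "stable instance ordering" requirements above can be illustrated with a small sketch. It assumes one simple way of satisfying both at once: keep the members of a cluster in join order and treat the longest-standing live member as leader. The class StableCluster and its methods are hypothetical and not part of discovery.api or of any particular 3rd party library; how joins and crashes are actually detected (liveliness detection) is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: join-ordered membership of a single cluster. */
final class StableCluster {

    // Instance ids in join order; the longest-standing live member is at index 0.
    private final List<String> members = new ArrayList<>();

    /** A joining instance is appended, so it can never displace the current leader. */
    void join(String instanceId) {
        if (!members.contains(instanceId)) {
            members.add(instanceId);
        }
    }

    /** Called when an instance leaves or is considered crashed. */
    void leave(String instanceId) {
        members.remove(instanceId);
    }

    /** The leader is the longest-standing live member, or null for an empty cluster. */
    String leader() {
        return members.isEmpty() ? null : members.get(0);
    }

    /** Read-only snapshot of the instances in their stable order. */
    List<String> instances() {
        return Collections.unmodifiableList(new ArrayList<>(members));
    }

    public static void main(String[] args) {
        StableCluster cluster = new StableCluster();
        cluster.join("a");
        cluster.join("b");
        System.out.println(cluster.leader());    // a
        cluster.join("c");                       // a new instance joins ...
        System.out.println(cluster.leader());    // ... and a is still the leader
        cluster.leave("a");                      // the leader leaves or crashes
        System.out.println(cluster.leader());    // b
        System.out.println(cluster.instances()); // [b, c]
    }
}
```

With this approach leadership only changes when the current leader goes away, and the relative order of the remaining instances never changes.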


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
