[
https://issues.apache.org/jira/browse/SLING-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696876#comment-13696876
]
Stefan Egli edited comment on SLING-2939 at 7/1/13 3:30 PM:
------------------------------------------------------------
[~ianeboston], [~rombert]: regarding JGroups: I think JGroups is quite a good
fit, except for two aspects:
* large installations would typically use UDP rather than point-to-point (the
536-machine cluster, for example, used UDP multicast). I believe that we would
like to support Sling deployments across data-centers and use discovery between
those data-centers for certain admin operations. My concern here is how
feasible UDP is across data-centers.
* I think the decision can be broken down to two deployment models: embedded
or dedicated servers. With embedding you have the advantage that no additional
services are required, but you would ideally use multicast (thus running into
the above concern). With a dedicated service there is the downside of such an
additional component, but the scalability of the point-to-point setup, also
cross data-center, seems better. (Scalability not in terms of pure performance
- there multicast is best - but in terms of ease of configuration/setup.) A
sketch of the TCP-based alternative follows below this list.
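For reference, a minimal sketch of the non-multicast variant, assuming stock
JGroups: tcp.xml ships with JGroups and combines the TCP transport with TCPPING
(a static list of initial hosts, typically passed via
-Djgroups.tcpping.initial_hosts), so no UDP multicast is needed. The cluster
name "sling-discovery" and the class itself are made up for illustration:
{code:java}
import org.jgroups.JChannel;
import org.jgroups.ReceiverAdapter;
import org.jgroups.View;

public class TcpDiscoveryProbe {
    public static void main(String[] args) throws Exception {
        // tcp.xml: TCP transport + TCPPING discovery, no multicast involved,
        // which is what a cross data-center deployment would need.
        JChannel channel = new JChannel("tcp.xml");
        channel.setReceiver(new ReceiverAdapter() {
            @Override
            public void viewAccepted(View view) {
                // The JGroups view is the current, ordered membership; it
                // could back liveliness detection and instance ordering.
                System.out.println("members: " + view.getMembers());
            }
        });
        channel.connect("sling-discovery");
    }
}
{code}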
> 3rd-party based implementation of discovery.api
> -----------------------------------------------
>
> Key: SLING-2939
> URL: https://issues.apache.org/jira/browse/SLING-2939
> Project: Sling
> Issue Type: Task
> Components: Extensions
> Affects Versions: Discovery API 1.0.0
> Reporter: Stefan Egli
> Assignee: Stefan Egli
>
> The Sling Discovery API introduces the abstraction of a topology which
> contains (Sling) clusters and instances, supports liveliness-detection,
> leader-election within a cluster and property-propagation between the
> instances. As a default and reference implementation a resource-based, OOTB
> implementation was created (org.apache.sling.discovery.impl).
> Pros and cons of the discovery.impl
> Although the discovery.impl supports everything required in discovery.api, it
> has a few limitations. Here's a list of pros and cons:
> Pros
> * No additional software required (leverages the repository for intra-cluster
> communication/storage and HTTP-REST calls for cross-cluster communication)
> * Very small footprint
> * Perfectly suited for a single cluster or instance and for small, rather
> stable hub-based topologies
> Cons
> * Config-/deployment-limitations (aka embedded-limitation): connections
> between clusters are peer-to-peer and explicit. To span a topology, a number
> of instances must be made known to each other, and changes in the topology
> typically require config adjustments to guarantee high availability of the
> discovery service
> ** except if a natural "hub cluster" exists that can serve as connection
> point for all "satellite clusters"
> ** other than that, it is less suited for large and/or dynamic topologies
> * Change propagation (for topology parts reported via connectors) is
> non-atomic and slow, hop-by-hop based
> * No guarantee on the order of TopologyEvents sent on individual instances -
> ie different instances might see different orders of TopologyEvents (ie
> changes in the topology), but eventually the topology is guaranteed to be
> consistent
> * Robustness of discovery.impl wrt storm situations depends on the robustness
> of the underlying cluster (not a real negative, but discovery.impl might in
> theory unveil repository bugs which would otherwise not have been a problem)
> * Rather new, little-tested code which might have issues with edge cases
> wrt network problems
> ** although partitioning-support is not a requirement per se, similar
> edge-cases might exist wrt network-delays/timing/crashes
> Reusing a suitable 3rd party library
> To provide an additional option as an implementation of the discovery.api,
> one idea is to use a suitable 3rd party library.
> Requirements
> The following is a list of requirements a 3rd party library must support:
> * liveliness detection: detect whether an instance is up and running
> * stable leader election within a cluster: stable describes the fact that a
> leader will remain leader until it leaves/crashes, and no new, joining
> instance shall take over while a leader exists
> * stable instance ordering: the list of instances within a cluster is
> ordered and stable; new, joining instances are put at the end of the list
> * property propagation: propagate the properties provided within one
> instance to everybody in the topology. There are no timing requirements bound
> to this; the intention is not to use it as messaging but to announce config
> parameters to the topology
> * support large, dynamic clusters: configuration of the new discovery
> implementation should be easy and support frequent changes in the (large)
> topology
> * no single point of failure: this is obvious, there should of course be no
> single point of failure in the setup
> * embedded or dedicated: this might be a hot topic: embedding a library has
> the advantage of not having to install anything additional. A dedicated
> service on the other hand requires additional handling in deployment.
> Embedding implies a peer-to-peer setup: nodes communicate peer-to-peer rather
> than via a centralized service. This IMHO is a negative for large topologies,
> which would typically span data-centers. Hence a dedicated service could be
> seen as an advantage in the end.
> * due to the need for cross data-center deployments, the transport protocol
> must be TCP (or HTTP for that matter)
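To make the contract quoted above concrete, here is a minimal sketch of a
consumer, based on the published org.apache.sling.discovery interfaces
(TopologyEventListener, TopologyEvent, ClusterView, InstanceDescription); the
class itself and its logging are made up, and in practice it would be
registered as an OSGi service of type TopologyEventListener. Any
implementation, whether discovery.impl or a 3rd-party based one, must deliver
the same events to such a component:
{code:java}
import java.util.List;

import org.apache.sling.discovery.ClusterView;
import org.apache.sling.discovery.InstanceDescription;
import org.apache.sling.discovery.TopologyEvent;
import org.apache.sling.discovery.TopologyEventListener;

public class LoggingTopologyListener implements TopologyEventListener {

    @Override
    public void handleTopologyEvent(TopologyEvent event) {
        if (event.getType() == TopologyEvent.Type.TOPOLOGY_CHANGING) {
            return; // old view no longer current, new view not yet known
        }
        ClusterView cluster =
                event.getNewView().getLocalInstance().getClusterView();
        // Stable instance ordering: the list keeps its order across events,
        // newly joining instances are appended at the end.
        List<InstanceDescription> instances = cluster.getInstances();
        for (InstanceDescription instance : instances) {
            // Stable leader election: exactly one instance per cluster is
            // leader and remains so until it leaves/crashes.
            System.out.println(instance.getSlingId()
                    + (instance.isLeader() ? " (leader)" : ""));
        }
    }
}
{code}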