If the use case is only Discovery, it might be simpler to run Apache
ZooKeeper [1] and use Apache Curator [2], as noted in SLING-2939.
When running on AWS, one could use Netflix Exhibitor [3], which
manages the ZooKeeper instances and backs up their state to S3.

The benefit of this approach is that ZooKeeper abstracts away all the
complexities of leader election (which is hard!) and can also be used
in on-premises installations if required.
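
As a rough illustration of the election idea only: the usual ZooKeeper
recipe has each participant register an ephemeral sequential node, and the
participant holding the lowest sequence number acts as leader. The sketch
below simulates that rule in memory. It does not call ZooKeeper or Curator,
and the class and method names (LeaderElectionSketch, join, leave, leader)
are hypothetical, chosen just for this example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/**
 * In-memory simulation of the "lowest sequence number wins" election rule.
 * Assumption: this stands in for ephemeral sequential nodes; a real setup
 * would delegate to ZooKeeper (for example through Curator) instead.
 */
final class LeaderElectionSketch {
    // Sequence number -> instance id, kept sorted so the lowest comes first.
    private final TreeMap<Long, String> nodes = new TreeMap<>();
    private long nextSequence = 0;

    /** Registers an instance and returns the sequence number assigned to it. */
    public synchronized long join(String instanceId) {
        long sequence = nextSequence++;
        nodes.put(sequence, instanceId);
        return sequence;
    }

    /** Removes a registration, as would happen when an instance's session ends. */
    public synchronized void leave(long sequence) {
        nodes.remove(sequence);
    }

    /** Returns the instance with the lowest sequence number, if any is registered. */
    public synchronized Optional<String> leader() {
        Map.Entry<Long, String> first = nodes.firstEntry();
        return first == null ? Optional.empty() : Optional.of(first.getValue());
    }

    public static void main(String[] args) {
        LeaderElectionSketch election = new LeaderElectionSketch();
        long a = election.join("sling-a");
        election.join("sling-b");
        System.out.println(election.leader()); // sling-a joined first, so it leads
        election.leave(a);
        System.out.println(election.leader()); // leadership passes to sling-b
    }
}
```

The hard parts that ZooKeeper takes care of (session expiry, consistent
ordering across nodes, notifying followers) are exactly what this
simulation leaves out.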

Chetan Mehrotra
[1] http://zookeeper.apache.org/
[2] http://curator.apache.org/
[3] https://github.com/Netflix/exhibitor

On Mon, May 12, 2014 at 2:54 PM, Timothée Maret
<[email protected]> wrote:
> Hi,
>
> 2014-05-12 9:02 GMT+01:00 Ian Boston <[email protected]>:
>
>> Hi,
>> +1 for distribution of properties via S3, makes perfect sense. Perhaps
>> abstracting behind an API so that any low latency globally distributed
>> storage provider could be used.
>>
>>
> Yes, discussing this offline with Felix, an alternative could be to
> implement a ResourceProvider for S3.
> S3 is really low level (key-value pair) with objects being binaries +
> metadata.
> We could implement the path structure based on the "prefix" property in [3]
> and stick to storing binaries only so that other S3 consumers can access
> the data directly (without using a Sling API).
>
>
>> Not sure about discovery. Although [0] describes the AWS VM, it
>> doesn't, without further validation, describe whether the Sling instance is
>> running and available. It's perfectly possible for the VM to be in a
>> running state with no viable Sling instance running. I don't think
>> that is hard to achieve, but it needs to be done to support the discovery
>> use case.
>>
>
> Exactly, out of the box the AWS API has no concept of a Sling instance, and
> we should implement it.
> According to [2] we could *not* leverage instance metadata since it can't
> be modified at runtime.
> Thus, we would need to have the Sling instances publish their state in S3.
>
>
>> I think we are talking about instances running on independent
>> repositories here, since if all instances share the same repository
>> (ie are a Jackrabbit cluster), then the repository already has a
>> mechanism of communicating running instances via the repository.
>>
>
> +1
>
>
>>
>> Best Regards
>> Ian
>>
>> On 12 May 2014 07:06, Carsten Ziegeler <[email protected]> wrote:
>> > Hi Timothée,
>> >
>> > yes, I think this is valuable - the idea of the discovery API is to abstract
>> > the discovery, and if we can benefit in certain scenarios from already
>> > available mechanisms/information, I think it makes total sense to use that
>> > instead of adding the same on top of it.
>> >
>> > Right now, the topology is formed of clusters containing instances - where
>> > all instances in a cluster share the same repository, but instances in
>> > different clusters use a different one. Is this kind of topology somehow
>> > possible by using the AWS API? Or would all instances end up in a single
>> > cluster?
>> >
>> > Regards
>> > Carsten
>> >
>> >
>> > 2014-05-11 18:54 GMT+02:00 Timothée Maret <[email protected]>:
>> >
>> >> Hi,
>> >>
>> >> I would like to discuss a potential implementation of the Sling Discovery
>> >> APIs over an eventually consistent distributed storage such as AWS S3.
>> >> Assuming the instances that are part of the topology run in AWS, we
>> >> could leverage AWS APIs and services in order to implement the Discovery
>> >> mechanism.
>> >>
>> >> The discovery of instances could be implemented implicitly using the EC2 REST
>> >> API [0] without sending heartbeats, the properties for each instance could
>> >> be stored in AWS S3 and distributed eventually, and the leader election could
>> >> be implemented with [1] or similar.
>> >>
>> >> The benefits (over the Sling impl) would be
>> >> * Arguably the highest availability we can get from the environment
>> >> * Reduced bandwidth consumption (no heartbeats)
>> >> * Environment-specific information is implicitly distributed (local IP,
>> >> external IP, hostname, region, etc.)
>> >>
>> >> Of course, it would bind the implementation to an environment (AWS in this
>> >> case); however, I believe we could apply the same mechanism to other
>> >> eventually consistent storage.
>> >>
>> >> What do you think? Is this something that would be valuable for Sling?
>> >>
>> >> Regards,
>> >>
>> >> Timothee
>> >>
>> >> [0] http://docs.aws.amazon.com/AWSEC2/latest/APIReference/ApiReference-query-DescribeInstances.html
>> >> [1] http://gsyc.es/~anto/papers/2007-dsn.pdf
>> >>
>> >
>> >
>> >
>> > --
>> > Carsten Ziegeler
>> > [email protected]
>>
>
> Regards
>
> Timothee
>
> [2] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AESDG-chapter-instancedata.html
> [3] http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html
