Re: [Discuss] Cloud Controller Clustering Model

Imesh Gunaratne Wed, 03 Dec 2014 14:01:04 -0800

Hi Devs,

I have now implemented the cloud controller coordinator management logic
and improved the distributed locking functionality. The changes are now in
the master branch.


*Cloud Controller (CC) Coordinator:*
- There will be only one coordinator for the cloud controller cluster at a
given time.
- Coordinator will be the only CC instance that listens to the cluster
status, application status and instance status topics.
- Coordinator will be the only CC instance that publishes topology.
- All the instances will respond to service calls.
- Any change to the CC state will be replicated.
- If the coordinator node goes down, another member of the cluster will
become the coordinator and start listening to the above topics and
publishing the topology.
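
A minimal sketch of the failover rule above, for illustration only. It
assumes the longest-running live member acts as coordinator; that election
rule, and the names CoordinatorSketch and Member, are hypothetical and not
taken from the actual implementation in master.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: the longest-running live member is the coordinator. */
class CoordinatorSketch {

    /** A CC instance. Only the coordinator would subscribe to the status
     *  topics and publish the topology. */
    static class Member {
        final String id;
        boolean coordinator;

        Member(String id) {
            this.id = id;
        }
    }

    // Live members in join order; index 0 is the current coordinator.
    private final List<Member> members = new ArrayList<>();

    synchronized void join(Member member) {
        members.add(member);
        elect();
    }

    /** Called when a member leaves or is detected as down. */
    synchronized void leave(Member member) {
        members.remove(member);
        member.coordinator = false;
        elect();
    }

    /** Returns the current coordinator, or null if the cluster is empty. */
    synchronized Member coordinator() {
        return members.isEmpty() ? null : members.get(0);
    }

    // Assumed rule: exactly one coordinator, the oldest live member.
    private void elect() {
        for (int i = 0; i < members.size(); i++) {
            members.get(i).coordinator = (i == 0);
        }
    }
}
```

With members cc-1 and cc-2 joined in that order, cc-1 is the coordinator;
once cc-1 leaves, cc-2 takes over and would start listening to the topics
and publishing the topology.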

*Distributed Locking in CC:*
- All the CC service methods are now managed by distributed locks.
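
To illustrate the pattern, here is a minimal sketch of a service method
guarded by a named lock. It uses a local ReentrantLock as a stand-in for a
cluster-wide lock, so it only shows the acquire/try/finally shape; the class
name LockedServiceSketch, the withLock helper and the lock name "topology"
are hypothetical and not the actual CC code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical sketch: every service call runs under a named lock. */
class LockedServiceSketch {

    // Stand-in for a cluster-wide lock registry. These locks are local to
    // one JVM; a clustered deployment would need a distributed lock instead.
    private final ConcurrentHashMap<String, Lock> locks = new ConcurrentHashMap<>();

    /** Runs the action while holding the named lock, releasing it even on failure. */
    <T> T withLock(String name, Supplier<T> action) {
        Lock lock = locks.computeIfAbsent(name, key -> new ReentrantLock());
        lock.lock();
        try {
            return action.get();
        } finally {
            lock.unlock();
        }
    }

    /** Four threads update shared state through the lock; returns the final count. */
    static int runDemo() {
        LockedServiceSketch service = new LockedServiceSketch();
        int[] counter = {0}; // unsynchronized state, protected only by the lock
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    service.withLock("topology", () -> ++counter[0]);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter[0];
    }
}
```

Releasing the lock in a finally block matters here: a service method that
throws must not leave the lock held, otherwise every other instance would
block on the next call.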

Thanks


On Mon, Dec 1, 2014 at 8:09 AM, Imesh Gunaratne <[email protected]> wrote:

> Hi Devs,
>
> I have now completed the initial implementation of $subject and pushed
> those changes to master branch.
>
> Thanks
>
> On Fri, Nov 28, 2014 at 1:49 PM, Gayan Gunarathne <[email protected]> wrote:
>
>> Hi,
>>
>> On Fri, Nov 28, 2014 at 1:00 PM, Akila Ravihansa Perera <
>> [email protected]> wrote:
>>
>>> Hi,
>>>
>>> According to this design Autoscaler (AS)/Stratos Manager (SM) will talk
>>>>> to Cloud Controller (CC) via the Cloud Controller Service endpoint exposed
>>>>> via the load balancer.
>>>>>
>>>>> *Data Replication*
>>>>> When a request comes into one of the CC instances it will execute the
>>>>> necessary actions and update the data holder and/or topology which is in
>>>>> memory. At this point the data holder changes will be replicated to other
>>>>> instances using a distributed map. Once the coordinator receives the above
>>>>> updates it will persist the changes to the registry database.
>>>>>
>>>>
>>>> Are we sending a notification (cluster message) when the distributed
>>>> map is updated?
>>>>
>>>
>>> This is handled by Hazelcast OOTB, right?
>>>
>>>
>>>>> In this design we might not need to replicate the topology since it is
>>>>> already there in the message broker. The idea is to let the coordinator
>>>>> publish the topology changes and the other members listen to it.
>>>>>
>>>>
>>> So that means worker nodes listen to the topology as well as cluster
>>> messages? I think we need to clarify this model a bit more.
>>>
>>>
>>>>
>>>> This would add latency to the events. What issues would we face if
>>>> each node sent out the events itself? Of course, the complete topology
>>>> should only be sent out by the Coordinator.
>>>>
>>>
>>> Sending out multiple topology events (e.g. MemberActivated,
>>> MemberTerminated) will trigger many listeners multiple times, and that's
>>> probably not a good idea. Or did you mean something else here? Sorry,
>>> I'm a bit confused.
>>>
>>
>> IMO the coordinator is the one that should handle persistence and message
>> publishing. The other instances' responsibility is to handle requests and
>> update the in-memory data grid.
>>
>>>
>>>
>>>> Also, we need to activate the CC data publishers only on the node that
>>>> is the Coordinator.
>>>>
>>>> Further, only the Coordinator should react to the Instance status
>>>> events etc. IMO.
>>>>
>>>
>>> I think this might result in an inconsistent state if the coordinator
>>> fails while processing an instance status event (or any other event for
>>> that matter). Perhaps we can implement a notifier cluster message to
>>> indicate whether incoming events are processed successfully. If the
>>> coordinator fails, the next elected coordinator should be able to pick up
>>> from the last successful event handled.
>>>
>>
>>  +1, we may need to synchronize the new coordinator with the last
>> coordinator's status. I guess we may need to maintain the coordinator
>> status.
>>
>>>
>>>
>>>> There's a cache to hold the validated partitions of a Cartridge, we
>>>> need to use a distributed hash map for that too.
>>>>
>>>
>>> +1
>>>
>>>
>>>>> Please add your thoughts.
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>> --
>>>>> Imesh Gunaratne
>>>>>
>>>>> Technical Lead, WSO2
>>>>> Committer & PMC Member, Apache Stratos
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Nirmal
>>>>
>>>> Nirmal Fernando.
>>>> PPMC Member & Committer of Apache Stratos,
>>>> Senior Software Engineer, WSO2 Inc.
>>>>
>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>
>>>
>>>
>>>
>>> --
>>> Akila Ravihansa Perera
>>> Software Engineer, WSO2
>>>
>>> Blog: http://ravihansa3000.blogspot.com
>>>
>>
>>
>>
>> --
>>
>> Gayan Gunarathne
>> Technical Lead
>> WSO2 Inc. (http://wso2.com)
>> email  : [email protected]  | mobile : +94 766819985
>>
>>
>
>
>
> --
> Imesh Gunaratne
>
> Technical Lead, WSO2
> Committer & PMC Member, Apache Stratos
>



-- 
Imesh Gunaratne

Technical Lead, WSO2
Committer & PMC Member, Apache Stratos
