Cellar node registrations can be lost for groups and distributed service
endpoints.
-----------------------------------------------------------------------------------
Key: KARAF-852
URL: https://issues.apache.org/jira/browse/KARAF-852
Project: Karaf
Issue Type: Bug
Components: cellar-core
Affects Versions: cellar-2.2.1, cellar-2.2.2
Reporter: Ioannis Canellos
Fix For: cellar-3.0.0, cellar-2.2.3
Groups are stored in a distributed collection. Each group object internally
keeps a set of members (the nodes that are registered to the group). If two or
more nodes register to the same group simultaneously, each node writes back its
own copy of the group object and the last write overwrites the others. The
members that were registered only in an overwritten copy are lost.
Exactly the same issue can occur with nodes registering for remote services.
Please note that this is also a problem when two clusters are merged.
This issue will probably never trigger when instances are started manually or
on relatively small clusters, but it's a problem in cloud deployments of
10+ nodes.
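
The lost update described above can be reproduced without a cluster. The
following is only a minimal sketch: a plain HashMap stands in for the
distributed collection, and the names LostUpdateDemo, readGroup and writeGroup
are hypothetical, not Cellar or Hazelcast API. Reads return a copy to mimic a
value that is deserialized from a distributed map.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LostUpdateDemo {
    // Stand-in for the distributed collection: group name -> member nodes.
    static final Map<String, Set<String>> groups = new HashMap<>();

    // Returns a copy, the way a distributed map hands out deserialized values.
    static Set<String> readGroup(String name) {
        Set<String> stored = groups.get(name);
        return stored == null ? new HashSet<>() : new HashSet<>(stored);
    }

    // Replaces the whole value for the group, as a plain put would.
    static void writeGroup(String name, Set<String> members) {
        groups.put(name, new HashSet<>(members));
    }

    // Interleaves two registrations: both nodes read before either one writes.
    static Set<String> simulateConcurrentRegistration() {
        groups.clear();
        Set<String> seenByA = readGroup("default");
        Set<String> seenByB = readGroup("default");
        seenByA.add("node-a");
        seenByB.add("node-b");
        writeGroup("default", seenByA);
        writeGroup("default", seenByB); // overwrites the copy that held node-a
        return readGroup("default");
    }

    public static void main(String[] args) {
        System.out.println(simulateConcurrentRegistration()); // prints [node-b]
    }
}
```

The registration of node-a is gone although neither node did anything wrong on
its own; the read-modify-write sequence is simply not atomic.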
One solution would be to use a distributed lock before accessing these
collections, but this won't solve the case of the clusters that merge.
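
The locking approach could look roughly like the sketch below. It assumes a
local ReentrantLock as a stand-in for a cluster-wide lock, and the class and
method names (LockedRegistration, register, registerConcurrently) are
hypothetical. It only shows that serializing the read-modify-write keeps every
member; it says nothing about the merge case, where two clusters have each
already built their own copy of the group.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class LockedRegistration {
    // Stand-in for a distributed lock shared by all nodes (assumption).
    static final Lock groupLock = new ReentrantLock();
    // Stand-in for the distributed collection: group name -> member nodes.
    static final Map<String, Set<String>> groups = new HashMap<>();

    // The whole read-modify-write sequence runs while holding the lock.
    static void register(String group, String node) {
        groupLock.lock();
        try {
            Set<String> stored = groups.get(group);
            Set<String> members = stored == null ? new HashSet<>() : new HashSet<>(stored);
            members.add(node);
            groups.put(group, members);
        } finally {
            groupLock.unlock();
        }
    }

    // Starts one thread per node and returns the members seen once all finish.
    static Set<String> registerConcurrently(int nodeCount) {
        groups.clear();
        Thread[] threads = new Thread[nodeCount];
        for (int i = 0; i < nodeCount; i++) {
            final String node = "node-" + i;
            threads[i] = new Thread(() -> register("default", node));
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new HashSet<>(groups.get("default"));
    }

    public static void main(String[] args) {
        System.out.println(registerConcurrently(12).size()); // prints 12
    }
}
```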
The alternative solution would be to refactor Cellar and keep node
registrations in separate collections. So instead of having a collection of
groups and keeping the nodes inside the Group object, we could have a collection
of groups and a multimap of nodes (the key will be the group name and the values
will be the nodes). We will first need to check how Hazelcast handles multimap
merges. If the value objects are overridden instead of appended when multimaps
are merged, we will have to work this out the other way around. By the other way
around I mean having the groups and distributed services as part of the node.
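
To make the difference between the two layouts concrete, the sketch below
merges two cluster halves under the pessimistic assumption that a merge
overwrites the value stored for a key. Plain HashMaps stand in for the
distributed collections, and all names (RegistrationModels, mergeOverwrite,
single) are hypothetical. It makes no claim about how Hazelcast actually merges
multimaps, which still has to be checked.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class RegistrationModels {
    // Assumed worst case: on merge, the right-hand value replaces the left one.
    static Map<String, Set<String>> mergeOverwrite(Map<String, Set<String>> left,
                                                   Map<String, Set<String>> right) {
        Map<String, Set<String>> merged = new HashMap<>(left);
        merged.putAll(right);
        return merged;
    }

    // Builds a one-entry map, e.g. the state held by one half of a split cluster.
    static Map<String, Set<String>> single(String key, String... values) {
        Map<String, Set<String>> map = new HashMap<>();
        map.put(key, new HashSet<>(Arrays.asList(values)));
        return map;
    }

    // Layout 1: key = group name, values = nodes. Both halves write the same key.
    static Set<String> groupKeyedMembers() {
        Map<String, Set<String>> merged = mergeOverwrite(
                single("default", "node-a"), single("default", "node-b"));
        return new TreeSet<>(merged.get("default"));
    }

    // Layout 2 ("the other way around"): key = node, values = its groups.
    // Each node only ever writes its own key, so the keys never collide.
    static Set<String> nodeKeyedMembers() {
        Map<String, Set<String>> merged = mergeOverwrite(
                single("node-a", "default"), single("node-b", "default"));
        Set<String> members = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : merged.entrySet()) {
            if (entry.getValue().contains("default")) {
                members.add(entry.getKey());
            }
        }
        return members;
    }

    public static void main(String[] args) {
        System.out.println("group-keyed: " + groupKeyedMembers()); // [node-b]
        System.out.println("node-keyed:  " + nodeKeyedMembers());  // [node-a, node-b]
    }
}
```

With node-keyed entries the membership of a group is derived by scanning the
nodes, which costs a little more on reads but leaves no shared value for two
writers to overwrite.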
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira