[ 
https://issues.apache.org/jira/browse/KARAF-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré updated KARAF-5562:
----------------------------------------
    Fix Version/s: cellar-4.1.2
                   cellar-4.0.5

> Improve cellar groups configuration synchronisation from hazelcast
> ------------------------------------------------------------------
>
>                 Key: KARAF-5562
>                 URL: https://issues.apache.org/jira/browse/KARAF-5562
>             Project: Karaf
>          Issue Type: Improvement
>          Components: cellar
>    Affects Versions: cellar-4.0.4, cellar-4.1.1
>            Reporter: Thomas Draier
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>             Fix For: cellar-4.0.5, cellar-4.1.2
>
>
> We encountered several issues caused by HazelcastGroupManager. I'm grouping 
> them here because they are all linked and we fixed them in a single 
> refactoring of the class. Overall, this results in better synchronization of 
> the cellar groups configuration.
> - Hazelcast network splits can result in very bad behaviour on the “groups” 
> shared map. This map contains the list of groups and their members, and the 
> system fully relies on it to know which groups a node belongs to. If multiple 
> nodes update the map while they are not connected to each other (easy to 
> reproduce by starting both nodes at the same time) and then join afterwards, 
> the default merge algorithm is applied and simply overwrites the full map. 
> This results in groups losing members, even if the configuration file claims 
> that the nodes are still members. 
> - When handling the groups configuration, HazelcastGroupManager replicates 
> the felix.fileinstall.filename property, which contains the configuration 
> file path, on each node. This is acceptable on a cluster where each node is 
> installed on the exact same path. However, with 2 nodes on the same machine 
> on different paths, one node will at some point write to the config file of 
> the other node and never update its own config, which can be quite confusing.
> - The HazelcastGroupManager can start before the configuration has been 
> detected by fileinstall. It then creates a new config, based on the 
> hazelcast shared config, which will override the config file when fileinstall 
> detects it. It does not have a huge impact, but it shuffles the properties 
> file and makes it unreadable. 
> - Updates from hazelcast to the local config trigger an update back to 
> hazelcast, which goes back to the local config and sometimes reverts the 
> changes, resulting in no change in the config. Basically, when adding a 
> group, a lot of properties are updated, and for each of them we trigger a 
> configuration update. Each configuration update triggers an event which sends 
> the whole config back to hazelcast, including properties that are not updated 
> yet, setting them back to their old values. All events (hazelcast updates and 
> osgi config) are handled asynchronously, so depending on the order of events, 
> some properties can be reverted or never added (typically, the groups 
> property is reverted after a group add). 
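
To illustrate the first point above (the full "groups" map being overwritten after a network split), here is a minimal sketch of a merge that keeps the union of members per group instead of letting one side win. It is plain Java and does not use the Hazelcast or Cellar APIs. The class name GroupMembershipMerge, the method merge and the map shape (group name to set of node ids) are assumptions made for illustration only, not the code from the actual refactoring.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical helper, not part of Cellar or Hazelcast.
 * Models the "groups" shared map as: group name -> ids of member nodes.
 */
final class GroupMembershipMerge {

    private GroupMembershipMerge() {
    }

    /**
     * Merges the two views of the groups map that exist after a split.
     * Every group known to either side is kept, and its members are the
     * union of the members seen on both sides. The inputs are not modified.
     */
    static Map<String, Set<String>> merge(Map<String, Set<String>> left,
                                          Map<String, Set<String>> right) {
        Map<String, Set<String>> merged = new HashMap<>();
        addAll(merged, left);
        addAll(merged, right);
        return merged;
    }

    /** Copies every member of every group in 'view' into 'target'. */
    private static void addAll(Map<String, Set<String>> target,
                               Map<String, Set<String>> view) {
        for (Map.Entry<String, Set<String>> entry : view.entrySet()) {
            target.computeIfAbsent(entry.getKey(), name -> new TreeSet<>())
                  .addAll(entry.getValue());
        }
    }
}
```

With this approach, two nodes that each registered only themselves in the same group while disconnected both remain members after the merge. The trade-off is that a union cannot express a removal that happened on only one side during the split, so a real implementation would probably still need to reconcile the result against each node's local configuration file.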



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
