[
https://issues.apache.org/jira/browse/KARAF-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jean-Baptiste Onofré updated KARAF-5562:
----------------------------------------
Fix Version/s: cellar-4.1.2, cellar-4.0.5
> Improve cellar groups configuration synchronisation from hazelcast
> ------------------------------------------------------------------
>
> Key: KARAF-5562
> URL: https://issues.apache.org/jira/browse/KARAF-5562
> Project: Karaf
> Issue Type: Improvement
> Components: cellar
> Affects Versions: cellar-4.0.4, cellar-4.1.1
> Reporter: Thomas Draier
> Assignee: Jean-Baptiste Onofré
> Priority: Major
> Fix For: cellar-4.0.5, cellar-4.1.2
>
>
> We encountered several issues caused by HazelcastGroupManager. I'm grouping
> them here because they are all linked and we fixed them in a single
> refactoring of the class. Overall, this results in better synchronization of
> the cellar groups configuration.
> - Hazelcast network splits can result in very bad behaviour on the “groups”
> shared map. This map contains the list of groups and their members, and the
> system fully relies on it to know which groups a node belongs to. If multiple
> nodes update the map while they are not connected to each other (easy to
> reproduce by starting both nodes at the same time) and then join afterwards,
> the default merge algorithm is applied and simply overwrites the full map.
> This results in groups losing members, even if the configuration file claims
> that the nodes are still members.
> - When handling the groups configuration, HazelcastGroupManager replicates
> the felix.fileinstall.filename property, which contains the configuration
> file path, to each node. This is fine on a cluster where each node is
> installed at the exact same path. However, if 2 nodes run on the same machine
> at different paths, one node will at some point write to the config file of
> the other node and never update its own config, which can be quite confusing.
> - The HazelcastGroupManager can start before fileinstall has detected a
> configuration. It then creates a new config, based on the hazelcast shared
> config, which will override the config file when fileinstall detects it. It
> does not have a huge impact, but it shuffles the properties file and makes
> it unreadable.
> - The updates from hazelcast to local config trigger an update back to
> hazelcast, which goes back to local config and sometimes reverts the changes,
> resulting in no change in the config. Basically, when adding a group, a lot
> of properties are updated - for each of them we trigger a configuration
> update. Each configuration update triggers an event which sends the whole
> config back to hazelcast, including properties that are not updated yet,
> setting them back to their old values. All events (hazelcast updates and osgi
> config) are handled asynchronously - depending on the order of events, some
> properties can be reverted or never added (typically the groups property is
> reverted after a group add).
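
To make the first and last points concrete, here is a minimal, dependency-free Java sketch. It is not the refactoring that was applied to HazelcastGroupManager. The class and method names (GroupSyncSketch, mergeGroups, changedProperties) are hypothetical and no Hazelcast or OSGi API is used. It only illustrates two ideas under stated assumptions: merging two diverged views of the groups map by taking the union of members instead of letting one side overwrite the other, and sending back only the properties that actually changed instead of the whole config.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical illustration only; not part of Cellar or Hazelcast.
class GroupSyncSketch {

    // Merge two views of "group name -> member node ids" by taking the union
    // of members per group, so a rejoin after a split does not drop members.
    // Assumption: a member present on either side is still a member.
    static Map<String, Set<String>> mergeGroups(Map<String, Set<String>> left,
                                                Map<String, Set<String>> right) {
        Map<String, Set<String>> merged = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : left.entrySet()) {
            merged.computeIfAbsent(e.getKey(), k -> new HashSet<>()).addAll(e.getValue());
        }
        for (Map.Entry<String, Set<String>> e : right.entrySet()) {
            merged.computeIfAbsent(e.getKey(), k -> new HashSet<>()).addAll(e.getValue());
        }
        return merged;
    }

    // Return only the properties whose value differs from the last synced
    // state. Publishing this delta instead of the whole config avoids pushing
    // stale values for properties that have not been updated yet.
    static Map<String, String> changedProperties(Map<String, String> lastSynced,
                                                 Map<String, String> current) {
        Map<String, String> delta = new HashMap<>();
        for (Map.Entry<String, String> e : current.entrySet()) {
            if (!Objects.equals(e.getValue(), lastSynced.get(e.getKey()))) {
                delta.put(e.getKey(), e.getValue());
            }
        }
        return delta;
    }
}
```

A union merge cannot tell a deliberate removal from a member that one side never saw, and the delta above ignores deleted properties, so a real implementation would need extra bookkeeping for both cases.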
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)