Hi, I'm using Karaf (4.1.3) inside Kubernetes and leveraging the Cellar (4.1.1) discovery integration with it. Just for context, I had to develop a custom integration with Kubernetes since the Karaf Cellar Kubernetes bundle is not updated for the most recent versions of Kubernetes (I'm using 1.9.0). I will address that in another issue.
For the issue in the subject, I've found a couple of problems with configurations. I've searched the forums and read all the threads related to this, and I know that there are problems with factories. In addition to the file install that comes bundled with Karaf, I'm using the Sling file installer (https://sling.apache.org/documentation/bundles/file-installer-provider.html), which watches a separate directory (I really need the run mode functionality). I don't think that this is the cause of the problem because, as mentioned by JB in another thread, Cellar watches Config Admin events, so it's completely unaware of the provider of configurations.

Now for the issues. As soon as I scale from 1 to 2 nodes in Kubernetes, all goes well with the discovery process: Cellar discovers the nodes and starts the synchronization process. I'm providing my Docker container with configurations via the Kubernetes ConfigMap functionality, and all my configs end up in a directory which the Sling file installer watches.
The first problem is the exception below:

2018-03-10T09:23:21,181 | INFO | CM Event Dispatcher (Fire ConfigurationEvent: pid=platform.modules.security.asymmetrickey.rsa.factory.1bd580f6-f892-467c-96e8-869683384394) | fileinstall | 6 - org.apache.felix.fileinstall - 3.5.8 | Unable to save configuration
java.lang.NullPointerException: null
    at org.apache.felix.fileinstall.internal.ConfigInstaller.doConfigurationEvent(ConfigInstaller.java:130) [6:org.apache.felix.fileinstall:3.5.8]
    at org.apache.felix.fileinstall.internal.ConfigInstaller.configurationEvent(ConfigInstaller.java:108) [6:org.apache.felix.fileinstall:3.5.8]
    at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.sendEvent(ConfigurationManager.java:2090) [5:org.apache.felix.configadmin:1.8.16]
    at org.apache.felix.cm.impl.ConfigurationManager$FireConfigurationEvent.run(ConfigurationManager.java:2058) [5:org.apache.felix.configadmin:1.8.16]
    at org.apache.felix.cm.impl.UpdateThread.run0(UpdateThread.java:141) [5:org.apache.felix.configadmin:1.8.16]
    at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:109) [5:org.apache.felix.configadmin:1.8.16]
    at java.lang.Thread.run(Thread.java:748) [?:?]

I've tracked this down to https://issues.apache.org/jira/browse/FELIX-5125, which is resolved and published in version 3.6.4 of file install. I've tried to use the overrides mechanism, which I'm already using successfully for other bundles, to update from the current version in Karaf (3.5.8) to this one, but it's not working. I suspect that this is happening because fileinstall comes from the framework feature, i.e. https://github.com/apache/karaf/blob/master/assemblies/features/framework/src/main/feature/feature.xml. I'm using the following lines in overrides.properties:

#https://issues.apache.org/jira/browse/FELIX-5125
mvn:org.apache.felix/org.apache.felix.fileinstall/3.6.4;range=[3.5.8,3.6.0)

Now for the second issue.
I'm using multiple factory configurations, and they are being duplicated. Since the factory configuration exists on both nodes, the instances are synced and I end up with one factory instance per replica, so in this case I get two.

On node 1:

org.apache.sling.commons.threads.impl.DefaultThreadPool.factory.df62f050-9588-4860-ac1f-121c1633f9a1.config (originally on node 1)
org.apache.sling.commons.threads.impl.DefaultThreadPool.factory.e5c5ddfb-0803-4307-9c06-cc6a142475ca.config (from node 2)

In this particular case I also get a JMX whiteboard exception, since the same MBean ends up being registered again:

2018-03-10T09:23:10,639 | ERROR | CM Event Dispatcher (Fire ConfigurationEvent: pid=platform.modules.invocationmanager.provider) | MBeanHolder | 119 - org.apache.aries.jmx.whiteboard - 1.1.5 | register: Failure registering MBean org.apache.sling.commons.threads.impl.ThreadPoolMBeanImpl@3692f4ac org.apache.sling.commons.threads.impl.ThreadPoolMBeanImpl@3692f4ac
javax.management.InstanceAlreadyExistsException: org.apache.sling:type=threads,service=ThreadPool,name=default
    at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437) ~[?:?]

The only workaround that I see is to use a star configuration for Cellar with only one producer of events; a fully clustered (multiple masters) configuration does not seem to work right now.

Regards,
Ivo Leitão

--
Sent from: http://karaf.922171.n3.nabble.com/Karaf-User-f930749.html
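P.S. To make the second issue concrete, here is a minimal Java sketch of what I think is happening. It is not Cellar's code: the class FactoryConfigDuplication and the methods newInstancePid and syncByPid are hypothetical names made up for illustration. It assumes that each node gets its own randomly generated instance PID for the same factory configuration (which matches the shape of the two file names above), and that synchronization is keyed by that instance PID.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

public class FactoryConfigDuplication {

    // Builds an instance PID shaped like the ones in the listing:
    // <factoryPid>.<random UUID>. Every node generates its own UUID.
    public static String newInstancePid(String factoryPid) {
        return factoryPid + "." + UUID.randomUUID();
    }

    // Hypothetical PID-keyed sync (an assumption, not Cellar's implementation):
    // every remote configuration whose PID is unknown locally is copied over.
    public static Map<String, String> syncByPid(Map<String, String> local,
                                                Map<String, String> remote) {
        Map<String, String> merged = new LinkedHashMap<>(local);
        remote.forEach(merged::putIfAbsent);
        return merged;
    }

    public static void main(String[] args) {
        String factoryPid =
            "org.apache.sling.commons.threads.impl.DefaultThreadPool.factory";

        // Both nodes load the same logical configuration from the same file,
        // but each one ends up with a different instance PID.
        Map<String, String> node1 = new LinkedHashMap<>();
        node1.put(newInstancePid(factoryPid), "name=default");

        Map<String, String> node2 = new LinkedHashMap<>();
        node2.put(newInstancePid(factoryPid), "name=default");

        // After the sync, node 1 holds both instances, although they
        // describe the same thread pool.
        Map<String, String> node1AfterSync = syncByPid(node1, node2);
        System.out.println(node1AfterSync.size() + " instance(s) on node 1:");
        node1AfterSync.forEach((pid, props) ->
            System.out.println("  " + pid + " -> " + props));
    }
}
```

Because the two PIDs differ, a PID-keyed merge cannot tell that both entries describe the same thread pool (name=default), which would explain both the duplicated .config files and the InstanceAlreadyExistsException.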