[ https://issues.apache.org/jira/browse/KARAF-5798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537555#comment-16537555 ]
Matthew Zipay commented on KARAF-5798:
--------------------------------------

I have a fix tested against 4.2.1-SNAPSHOT that includes the following:
* consistent way to obtain the process ID
* write the _karaf.pid_ file as soon as the process launches (so it is correct for either a master or slave instance)
* new unit tests to verify that _karaf.pid_ gets written whether or not the lock is acquired
* wait until lock acquisition before updating the instance PID (so it reflects the current running master)

PR is [https://github.com/apache/karaf/pull/542]

I need these changes specifically against Karaf 4.0.9, so I have also applied the same changes in a branch based on the 4.0.9 tag. Do you want that as a PR as well or no?

> Karaf slave instance does not write pid or port file until it becomes master
> ----------------------------------------------------------------------------
>
>                 Key: KARAF-5798
>                 URL: https://issues.apache.org/jira/browse/KARAF-5798
>             Project: Karaf
>          Issue Type: Bug
>          Components: karaf-boot
>    Affects Versions: 4.0.9
>            Reporter: Matthew Zipay
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>
> In a Karaf master/slave environment, the slave process does not write its pid
> or port file until it acquires the lock and becomes the master.
> I am running Karaf 4.0.9 (ServiceMix 7.0.1).
> Karaf is configured as master/slave using the following from
> system.properties. Master and slave are on different physical nodes.
> {code:java}
> karaf.lock=true
> karaf.lock.class=org.apache.karaf.main.lock.OracleJDBCLock
> karaf.lock.level=79
> karaf.lock.delay=10000
> karaf.lock.jdbc.url=jdbc:oracle:thin:#REMOVED#
> karaf.lock.jdbc.driver=oracle.jdbc.driver.OracleDriver
> karaf.lock.jdbc.user=#REMOVED#
> karaf.lock.jdbc.password=#REMOVED#
> karaf.lock.jdbc.table=KARAF_LOCK
> karaf.lock.jdbc.clustername=karaf
> karaf.lock.jdbc.timeout=30
> karaf.lock.slave.block=false
> {code}
> Attempting to stop the slave Karaf process results in _"Can't connect to the
> container. The container is not running."_ This is not true, as a simple
> {{ps -ef | grep karaf}} confirms that it is in fact running. I am able to enter
> the Karaf shell just fine, use the web console, etc.
> I have confirmed through multiple tests that the pid and port files don't get
> written until the master lock is acquired.
> Steps:
> # With the Karaf slave node not started, note the pid and port files do not
> exist (or contain outdated values from a previous process).
> # Start the Karaf slave process.
> # Note that the pid and port files have not been written.
> # Stop the master process.
> # Observe the slave process acquire the lock and become master.
> # Note that the pid and port files have now been written.
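
For illustration only, the first two items in the comment (a consistent way to obtain the process ID, and writing _karaf.pid_ at launch regardless of lock state) might look roughly like the sketch below. This is not the code from the pull request. The class name {{PidFileSketch}} and both method names are hypothetical, and the sketch assumes the JVM reports its runtime name as {{pid@hostname}}, which is common but not guaranteed by the Java specification.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Karaf: one place to look up the PID,
// and one method that writes it to a file without consulting any lock.
final class PidFileSketch {

    // Assumption: the runtime MXBean name has the form "pid@hostname".
    // The Javadoc allows any string here, so validate before trusting it.
    static String currentPid() {
        String name = ManagementFactory.getRuntimeMXBean().getName();
        int at = name.indexOf('@');
        String pid = at > 0 ? name.substring(0, at) : name;
        Long.parseLong(pid); // throws NumberFormatException if not numeric
        return pid;
    }

    // Creates the file, or truncates an existing one, and writes the PID.
    // Intended to be called at process launch, before any lock acquisition.
    static void writePidFile(Path pidFile) throws IOException {
        Files.write(pidFile, currentPid().getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path pidFile = Paths.get(args.length > 0 ? args[0] : "karaf.pid");
        writePidFile(pidFile);
        System.out.println("Wrote PID " + currentPid() + " to " + pidFile.toAbsolutePath());
    }
}
```

Because the write does not depend on the lock, a slave instance that is still waiting for the lock would have a pid file that matches its own running process, which is the behaviour the stop command needs in the scenario described in the issue.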