[ 
https://issues.apache.org/jira/browse/KARAF-5798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16533163#comment-16533163
 ] 

Matthew Zipay commented on KARAF-5798:
--------------------------------------

Still testing, and I need to amend my workaround: it does *not* fully work. 
Finding the process ID of the slave instance is easy enough, but I had 
incorrectly assumed that the listening port I saw through lsof was the 
shutdown port. It is something else.

So I'm back to looking at the lockAcquired callback and how it writes the pid 
and port files. My question now is:

_Why_ does Karaf wait until it is the master before it writes the pid file, 
starts the shutdown thread, and writes the port file?

At least in the case of the pid file, this information is known by 
RuntimeMXBean well in advance of the instance grabbing the lock, so I still 
think that should be written even by a slave.
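
To illustrate the point about the pid, here is a minimal sketch of reading it 
from RuntimeMXBean and writing it to a file without involving any lock. The 
class {{PidProbe}} and its methods are hypothetical names for this example and 
are not Karaf code. It also assumes the conventional "pid@hostname" format of 
{{RuntimeMXBean.getName()}}, which HotSpot JVMs use but the API does not 
guarantee.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of Karaf.
final class PidProbe {

    // Assumption: the runtime name looks like "pid@hostname" (HotSpot convention).
    static long parsePid(String runtimeName) {
        int at = runtimeName.indexOf('@');
        if (at <= 0) {
            throw new IllegalArgumentException("Unexpected runtime name: " + runtimeName);
        }
        return Long.parseLong(runtimeName.substring(0, at));
    }

    // The pid is available as soon as the JVM is up, before any lock is attempted.
    static long currentPid() {
        return parsePid(ManagementFactory.getRuntimeMXBean().getName());
    }

    // Writes the pid as plain text, replacing any stale content in the file.
    static void writePidFile(Path pidFile) throws IOException {
        Files.write(pidFile, Long.toString(currentPid()).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path pidFile = Files.createTempFile("karaf-sketch", ".pid");
        writePidFile(pidFile);
        System.out.println("pid " + currentPid() + " written to " + pidFile);
    }
}
```

Nothing in this sketch depends on lock state, which is why it seems like it 
could run equally well in a slave.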

Is there something "special" about the shutdown thread that would make it 
impossible or inadvisable for a slave to start it? It looks like it just 
listens for the shutdown command on a socket and calls {{Framework.stop()}}... 
is that somehow problematic for a slave instance?
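
For reference, this is roughly the shape of what I mean, as a simplified 
sketch based on my reading and not Karaf's actual implementation. The class 
name {{ShutdownListenerSketch}}, the command string, and the {{Runnable}} 
callback (standing in for {{Framework.stop()}}) are all assumptions made for 
the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch for illustration only; not Karaf's shutdown thread.
final class ShutdownListenerSketch implements Runnable, AutoCloseable {
    private final ServerSocket server;
    private final String command;
    private final Runnable onShutdown; // stands in for Framework.stop()

    ShutdownListenerSketch(String command, Runnable onShutdown) throws IOException {
        // Port 0 lets the OS pick a free port; bind to loopback only.
        this.server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
        this.command = command;
        this.onShutdown = onShutdown;
    }

    // The value a port file would need to contain.
    int port() {
        return server.getLocalPort();
    }

    @Override
    public void run() {
        while (!server.isClosed()) {
            try (Socket client = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8))) {
                // Only the exact command on the first line triggers the callback.
                if (command.equals(in.readLine())) {
                    onShutdown.run();
                    server.close();
                    return;
                }
            } catch (IOException e) {
                // accept() fails once the socket is closed; the loop condition then exits.
            }
        }
    }

    @Override
    public void close() throws IOException {
        server.close();
    }
}
```

In this sketch the listener needs only a socket and a callback, neither of 
which depends on holding the lock. Whether calling {{Framework.stop()}} from a 
slave is actually safe is the part I can't answer.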

> Karaf slave instance does not write pid or port file until it becomes master
> ----------------------------------------------------------------------------
>
>                 Key: KARAF-5798
>                 URL: https://issues.apache.org/jira/browse/KARAF-5798
>             Project: Karaf
>          Issue Type: Bug
>          Components: karaf-boot
>    Affects Versions: 4.0.9
>            Reporter: Matthew Zipay
>            Assignee: Jean-Baptiste Onofré
>            Priority: Major
>
> In a Karaf master/slave environment, the slave process does not write its pid 
> or port file until it acquires the lock and becomes the master.
> I am running Karaf 4.0.9 (ServiceMix 7.0.1).
> Karaf is configured as master/slave using the following from 
> system.properties. Master and slave are on different physical nodes.
> {code:java}
> karaf.lock=true
> karaf.lock.class=org.apache.karaf.main.lock.OracleJDBCLock
> karaf.lock.level=79
> karaf.lock.delay=10000
> karaf.lock.jdbc.url=jdbc:oracle:thin:#REMOVED#
> karaf.lock.jdbc.driver=oracle.jdbc.driver.OracleDriver
> karaf.lock.jdbc.user=#REMOVED#
> karaf.lock.jdbc.password=#REMOVED#
> karaf.lock.jdbc.table=KARAF_LOCK
> karaf.lock.jdbc.clustername=karaf
> karaf.lock.jdbc.timeout=30
> karaf.lock.slave.block=false
> {code}
> Attempting to stop the slave Karaf process results in _"Can't connect to the 
> container. The container is not running."_ This is not true, as a simple {{ps 
> -ef | grep karaf}} confirms that it is in fact running. I am able to enter 
> the Karaf shell just fine, use the web console, etc.
> I have confirmed through multiple tests that the pid and port files don't get 
> written until the master lock is acquired.
> Steps:
> # With the Karaf slave node not started, note the pid and port files do not 
> exist (or contain outdated values from a previous process).
> # Start the Karaf slave process.
> # Note that the pid and port files have not been written.
> # Stop the master process.
> # Observe the slave process acquire the lock and become master.
> # Note that the pid and port files have now been written.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)