[ https://issues.apache.org/jira/browse/BROOKLYN-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aled Sage resolved BROOKLYN-484.
--------------------------------
Resolution: Fixed
Fix Version/s: 0.12.0
Fixed in https://github.com/apache/brooklyn-library/pull/101
> JBoss7 entity restart fails (launch ssh session returns before process
> running?)
> --------------------------------------------------------------------------------
>
> Key: BROOKLYN-484
> URL: https://issues.apache.org/jira/browse/BROOKLYN-484
> Project: Brooklyn
> Issue Type: Bug
> Reporter: Aled Sage
> Priority: Minor
> Fix For: 0.12.0
>
>
> With version 0.11.0-rc1...
> We've seen a failure of the {{restart}} effector for {{JBoss7Server}}. The
> post-launch step failed (waiting for the url to be reachable/responsive).
> Unfortunately there's no additional debugging information available - the VMs
> are gone, and the debug log is not available.
> However, I've identified a reason why this might happen.
> On {{start}}, the {{JBoss7SshDriver.launch}} script will redirect
> stdout/stderr to a file named {{console}}, and will then wait for that file
> to say 'starting'.
> Importantly, there is an old comment saying:
> {noformat}
> // We wait for evidence of JBoss running because, using SshCliTool,
> // we saw the ssh session return before the JBoss process was fully running
> // so the process failed to start.
> {noformat}
> On {{restart}}, the process is stopped and {{JBoss7SshDriver.launch}} is
> called again. However, the launch script appends to the existing {{console}}
> file rather than replacing it, so the file still contains 'starting' from the
> previous run. The check for 'starting' therefore returns immediately, which
> means the ssh session could return before the JBoss process is fully running.
> A solution would be to change {{launch}} to first move the previous
> {{console}} file out of the way. Subsequent calls to the {{launch}} script
> would then wait for the new process to be running.
> This same problem would also apply to other entities, such as
> {{TomcatSshDriver.launch}}.
> A way to reproduce this would probably be to repeatedly call the {{restart}}
> effector (waiting for serviceUp to be true again between each call). The
> restart almost always succeeds - I've personally seen this failure only once.
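
The append-versus-move behaviour described in the issue can be sketched in plain Java. This is a minimal illustration only, not the Brooklyn driver code and not the change made in the pull request: the class ConsoleLogWait, its method names, and the ".previous" suffix are all hypothetical, and an ordinary file stands in for the real console log.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration of the stale-console problem; not Brooklyn code. */
final class ConsoleLogWait {

    private ConsoleLogWait() {}

    /** Mimics the launch check: true if the console file mentions 'starting'. */
    static boolean saysStarting(Path console) throws IOException {
        if (!Files.exists(console)) {
            return false;
        }
        String content = new String(Files.readAllBytes(console), StandardCharsets.UTF_8);
        return content.contains("starting");
    }

    /** Mimics the server process appending a line to its console log. */
    static void appendLine(Path console, String line) throws IOException {
        byte[] bytes = (line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
        Files.write(console, bytes, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Sketch of the proposed fix: move any previous console file aside before relaunching. */
    static void moveAside(Path console) throws IOException {
        if (Files.exists(console)) {
            // Assumed naming scheme: keep one old copy next to the original.
            Path previous = console.resolveSibling(console.getFileName() + ".previous");
            Files.move(console, previous, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

With a console file left over from an earlier launch, saysStarting returns true before the new process has written anything, which is the suspected cause of the early return. After moveAside, saysStarting returns false until the new process appends its own 'starting' line.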