Aled Sage created BROOKLYN-243:
----------------------------------
Summary: MySql stop+restart: timed out waiting for serviceUp (due
to enrichers/feeds?)
Key: BROOKLYN-243
URL: https://issues.apache.org/jira/browse/BROOKLYN-243
Project: Brooklyn
Issue Type: Bug
Reporter: Aled Sage
Using Brooklyn 0.9.0-SNAPSHOT, I deployed MySqlNode to a BYON VM in AWS (on
CentOS 6.5).
My automated test script invoked stop() on the MySqlNode to stop only the
process, and then invoked restart().
The restart() successfully restarted the process, but then the post-restart
task timed out waiting for SERVICE_UP.
Looking at the sensor values, I think (*) it showed:
{noformat}
mysql.queries.perSec.fromMysql: 0.29
service.process.isRunning: true
service.state: STARTING
service.isUp: false
service.notUp.indicators: {}
{noformat}
(*) Unfortunately, the automated test script changed the state of the entity
before I had copied all the values, but I'm pretty sure it was in this state.
This suggests that the feed was doing its job (having populated isRunning and
queries.perSec); the log confirmed that the feed was being executed
periodically.
It also suggests that the enricher had updated notUp.indicators correctly, but
that the enricher created by
{{ServiceNotUpLogic.newEnricherForServiceUpIfNotUpIndicatorsEmpty()}} had
somehow not set serviceUp.
This is very surprising! The entity was previously up; the enricher has been
there for a while. I therefore don't think it's a race with the first value
being missed or anything like that.
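A minimal sketch of the relationship the sensor values above appear to violate,
assuming service.isUp is expected to be true exactly when
service.notUp.indicators is an empty map. The class and method names are
hypothetical and the treatment of a missing (null) map is an assumption; this
is not Brooklyn's implementation.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper for checking a sensor snapshot; not part of Brooklyn.
class ServiceUpInvariantSketch {

    /** Expected service.isUp: true only when the not-up indicators map is present and empty. */
    static boolean expectedServiceUp(Map<String, ?> notUpIndicators) {
        // Assumption: a missing (null) map is treated as "not known to be up".
        return notUpIndicators != null && notUpIndicators.isEmpty();
    }

    /** True if the observed service.isUp matches what the indicators imply. */
    static boolean isConsistent(Map<String, ?> notUpIndicators, boolean observedServiceUp) {
        return expectedServiceUp(notUpIndicators) == observedServiceUp;
    }

    public static void main(String[] args) {
        // Snapshot from the report: service.notUp.indicators = {}, service.isUp = false.
        Map<String, Object> indicators = Collections.emptyMap();
        System.out.println("consistent: " + isConsistent(indicators, false));
    }
}
```

The sketch only restates the expected relationship; the real logic lives in
the enricher in the Brooklyn code.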
A (probably unrelated) worry I have about this code is for stop(): we stop the
feeds (but we don't wait for the feeds to be terminated), and then set
isRunning to false. There is a race: a poll that is still in flight could set
isRunning back to true after stop() has set it to false, leaving the entity
saying isRunning=true even though the process is stopped.
This is not reproducible; I've only ever seen it once.
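
The stop() race described above can be illustrated with a small standalone
sketch. It assumes a feed modelled as a single-threaded executor and an
isRunning sensor modelled as an AtomicBoolean; all names are hypothetical and
none of this is Brooklyn code. A latch forces the in-flight poll to publish its
stale result after stop() has written false, so the outcome does not depend on
lucky timing.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for an entity with a polling feed; not Brooklyn code.
class StopRaceSketch {

    /** Runs one simulated stop() and returns the final value of the isRunning "sensor". */
    static boolean stopAndReadIsRunning(boolean waitForFeed) {
        AtomicBoolean isRunning = new AtomicBoolean(true);
        ExecutorService feed = Executors.newSingleThreadExecutor();
        CountDownLatch pollStarted = new CountDownLatch(1);
        CountDownLatch stopWroteFalse = new CountDownLatch(1);

        // An in-flight poll: it saw the process alive, but publishes the result late.
        feed.execute(() -> {
            boolean observed = true; // stale observation taken before the process stopped
            pollStarted.countDown();
            try {
                // Slow poll: finishes once stop() has written false, or after 1 second.
                stopWroteFalse.await(1, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            isRunning.set(observed);
        });

        try {
            pollStarted.await();
            feed.shutdown(); // stops future polls; does not wait for the running one
            if (waitForFeed) {
                feed.awaitTermination(5, TimeUnit.SECONDS); // let the in-flight poll finish first
            }
            isRunning.set(false); // stop() records that the process is down
            stopWroteFalse.countDown();
            feed.awaitTermination(5, TimeUnit.SECONDS); // let any late poll land before reading
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return isRunning.get();
    }

    public static void main(String[] args) {
        System.out.println("without waiting: isRunning=" + stopAndReadIsRunning(false));
        System.out.println("with waiting:    isRunning=" + stopAndReadIsRunning(true));
    }
}
```

In this model, waiting for the feed to terminate before writing false closes
the window. Whether the same ordering is practical in the real stop() path is
not established here.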