-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53633/
-----------------------------------------------------------
Review request for Ambari, Dmytro Sen, Sumit Mohanty, and Sid Wagle.
Bugs: AMBARI1-18841
https://issues.apache.org/jira/browse/AMBARI1-18841
Repository: ambari
Description
-------
Grafana fails to start with the following error:
{code}
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_grafana.py", line 69, in <module>
    AmsGrafana().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_grafana.py", line 47, in start
    not_if = params.grafana_process_exists_cmd,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/sbin/ambari-metrics-grafana start' returned 1. ######## Hortonworks #############
This is MOTD message, added for testing in qe infra
Starting Ambari Metrics Grafana: .... FAILED
{code}
PROBLEM
Grafana start fails intermittently. At times the Grafana DB starts up more
slowly than the Grafana Server. The Grafana server start script then gives up
and exits with an error, leaving a Grafana Server instance running. The DB
comes up fine some time after that. Since the PID is not logged correctly,
Ambari assumes Grafana Server is down. A subsequent START Grafana from Ambari
then fails because a Grafana instance is already running.
FIX
On Ambari's end, kill any running instance of Grafana as part of starting
Grafana. In the worst case of slow DB startup, the first START may still fail,
but the second START will succeed.
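For illustration only, the idea behind the fix can be sketched in plain Python
as below. This is not the code from the diff: the function names
kill_stale_instances and start_grafana, and the find_pids / start callables,
are hypothetical stand-ins for however the real scripts locate and launch
Grafana. The sketch only shows the ordering (clear any leftover instance, then
start) and that a PID which has already exited is ignored.

```python
import os
import signal


def kill_stale_instances(pids, sig=signal.SIGTERM, kill=os.kill):
    """Send `sig` to every PID in `pids`; return the PIDs that were signalled.

    A PID that cannot be signalled (for example, it has already exited)
    raises OSError and is skipped, so calling this when nothing is running
    is a harmless no-op.
    """
    signalled = []
    for pid in pids:
        try:
            kill(pid, sig)
            signalled.append(pid)
        except OSError:
            # Process is gone or cannot be signalled; nothing to clean up.
            pass
    return signalled


def start_grafana(find_pids, start, kill=os.kill):
    """Hypothetical start flow: clear leftover instances first, then start.

    `find_pids` returns the PIDs of any running Grafana processes and
    `start` launches the server; both are assumptions of this sketch.
    """
    kill_stale_instances(find_pids(), kill=kill)
    return start()
```

With this ordering, a Grafana Server left behind by a failed first START no
longer blocks the next START, which matches the behaviour described above.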
Diffs
-----
ambari-metrics/ambari-metrics-grafana/conf/unix/ambari-metrics-grafana f0c2ed4
ambari-server/src/main/resources/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_grafana.py 0c9bb08
Diff: https://reviews.apache.org/r/53633/diff/
Testing
-------
Manually tested START, RESTART & STOP.
Python unit tests pass.
Thanks,
Aravindan Vijayan