-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53633/
-----------------------------------------------------------

Review request for Ambari, Dmytro Sen, Sumit Mohanty, and Sid Wagle.


Bugs: AMBARI1-18841
    https://issues.apache.org/jira/browse/AMBARI1-18841


Repository: ambari


Description
-------

Grafana fails to start with the following error:

{code}
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_grafana.py", line 69, in <module>
    AmsGrafana().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_grafana.py", line 47, in start
    not_if = params.grafana_process_exists_cmd,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/sbin/ambari-metrics-grafana start' returned 1. ######## Hortonworks #############
This is MOTD message, added for testing in qe infra
Starting Ambari Metrics Grafana: .... FAILED
{code}

PROBLEM
Grafana START fails intermittently. At times the Grafana DB comes up more 
slowly than the Grafana Server. When that happens, the Grafana server start 
script gives up and exits with an error, but leaves a Grafana Server instance 
running. The DB comes up fine some time later. Because the PID is not recorded 
correctly, Ambari assumes Grafana Server is down. A subsequent START from 
Ambari then fails because a Grafana instance is already running.

FIX
On Ambari's side, kill any running Grafana instance before starting Grafana. 
In the worst case, where the DB is slow to start, the first START may still 
fail, but the second START will succeed.
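
For illustration only, the "kill before start" idea can be sketched in plain 
Python as below. This is not the code in the patch. The helper names 
(find_pids, kill_stale_instances), the use of pgrep, and the example pattern 
"grafana-server" are all assumptions made for this sketch.

```python
import os
import signal
import subprocess


def find_pids(pattern):
    """Return PIDs whose command line matches `pattern`, using `pgrep -f`.

    Hypothetical helper: assumes pgrep is available on the host.
    """
    proc = subprocess.Popen(["pgrep", "-f", pattern],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, _ = proc.communicate()
    # pgrep prints nothing (and exits non-zero) when no process matches.
    return [int(pid) for pid in out.decode().split()]


def kill_stale_instances(pattern, find=find_pids, kill=os.kill):
    """Send SIGTERM to every process matching `pattern`.

    Returns the list of PIDs that were signalled. `find` and `kill` are
    injectable so the logic can be exercised without real processes.
    """
    signalled = []
    for pid in find(pattern):
        if pid == os.getpid():
            continue  # never signal the calling process itself
        try:
            kill(pid, signal.SIGTERM)
            signalled.append(pid)
        except OSError:
            # The process exited between lookup and kill; nothing to do.
            pass
    return signalled
```

A caller would run something like kill_stale_instances("grafana-server") 
immediately before invoking '/usr/sbin/ambari-metrics-grafana start', so that 
a leftover instance whose PID was never recorded cannot block the new start.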


Diffs
-----

  ambari-metrics/ambari-metrics-grafana/conf/unix/ambari-metrics-grafana f0c2ed4 
  ambari-server/src/main/resources/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_grafana.py 0c9bb08 

Diff: https://reviews.apache.org/r/53633/diff/


Testing
-------

Manually tested START, RESTART & STOP.

Python unit tests pass.


Thanks,

Aravindan Vijayan
