[ https://issues.apache.org/jira/browse/AMBARI-22834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16336125#comment-16336125 ]
Kevin Risden commented on AMBARI-22834:
---------------------------------------
This is from the ambari-agent log when checking for the pid file:
{code:bash}
INFO 2018-01-16 16:00:05,500 logger.py:75 - Process with pid 257987 is not running. Stale pid file at /var/run/zeppelin/zeppelin-interpreter-livy-zeppelin-HOSTNAME.pid
ERROR 2018-01-16 16:00:05,501 script_alert.py:123 - [Alert][zeppelin_server_status] Failed with result CRITICAL: ['']
{code}
The alert picked up the wrong pid file: it read the Livy interpreter's pid
file, which was stale because the interpreter had been stopped/restarted.
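One possible way to avoid matching interpreter pid files is sketched below. This is only an illustration, not the actual Ambari fix: the function name `find_zeppelin_server_pid_file` is hypothetical, and it assumes that interpreter pid files always start with `zeppelin-interpreter-`, as in the directory listing from this issue.

```python
import glob
import os


def find_zeppelin_server_pid_file(zeppelin_pid_dir):
    """Return the Zeppelin server pid file path, or None if there is none.

    Assumption (taken from the listing in this issue): interpreter pid
    files are named 'zeppelin-interpreter-*.pid', while the server pid
    file is named 'zeppelin-<user>-<host>.pid'.
    """
    pattern = os.path.join(zeppelin_pid_dir, 'zeppelin-*.pid')
    # glob.glob returns matches in arbitrary order, so sort for determinism.
    candidates = sorted(glob.glob(pattern))
    server_pid_files = [
        path for path in candidates
        if not os.path.basename(path).startswith('zeppelin-interpreter-')
    ]
    # Return None instead of raising IndexError when nothing matches.
    return server_pid_files[0] if server_pid_files else None
```

With the two files shown in the issue description, this would return the path to `zeppelin-zeppelin-HOSTNAME.pid` and skip the Livy interpreter's pid file.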
> Zeppelin Alert checks wrong pid file
> ------------------------------------
>
> Key: AMBARI-22834
> URL: https://issues.apache.org/jira/browse/AMBARI-22834
> Project: Ambari
> Issue Type: Bug
> Components: alerts
> Affects Versions: trunk, 2.6.2
> Reporter: Kevin Risden
> Priority: Minor
>
> The Zeppelin alert check doesn't check the actual Zeppelin pid file. Instead
> it can pick up interpreter pid files.
> {code:python}
> pid_file = glob.glob(zeppelin_pid_dir + '/zeppelin-*.pid')[0]
> {code}
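> This is wrong when there are multiple files in the pid dir starting with
> "zeppelin-": glob.glob returns matches in arbitrary order, so index [0] may
> be an interpreter pid file rather than the server pid file.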
> {code:bash}
> ls -l /var/run/zeppelin/
> -rw-r--r-- 1 zeppelin hadoop 7 Jan 16 12:01 zeppelin-interpreter-livy-zeppelin-HOSTNAME.pid
> -rw-r--r-- 1 zeppelin hadoop 7 Jan 16 11:56 zeppelin-zeppelin-HOSTNAME.pid
> {code}
> * [https://github.com/apache/ambari/blob/trunk/ambari-server/src/main/resources/common-services/ZEPPELIN/0.6.0/package/scripts/alert_check_zeppelin.py]
> * [https://github.com/apache/ambari/blob/trunk/ambari-server/src/main/resources/common-services/ZEPPELIN/0.7.0/package/scripts/alert_check_zeppelin.py]