[ 
https://issues.apache.org/jira/browse/AMBARI-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ritesh updated AMBARI-21697:
----------------------------
    Summary: Ambari showing false alert for STS connectivity while using http 
mode  (was: Spark thrift service was alerting for connectivity while using http 
mode)

> Ambari showing false alert for STS connectivity while using http mode
> ---------------------------------------------------------------------
>
>                 Key: AMBARI-21697
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21697
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.5.1
>            Reporter: Ritesh
>         Attachments: AMBARI-999.patch
>
>
> Newly installed clusters keep showing the Spark Thrift Server down alert 
> when using http mode.
> The alert fires every time a new cluster is created. 
> The script used by the alert is 
> /var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package/scripts/alerts/alert_spark2_thrift_port.py
> Error stack 
> =======
> Connection failed on host 
> hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016 
> (Traceback (most recent call last): 
> File 
> "/var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package/scripts/alerts/alert_spark2_thrift_port.py",
>  line 144, in execute 
> Execute(cmd, user=hiveruser, path=[beeline_cmd], 
> timeout=CHECK_COMMAND_TIMEOUT_DEFAULT) 
> File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", 
> line 155, in __init__ 
> self.env.run() 
> File 
> "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", 
> line 160, in run 
> self.run_action(resource, action) 
> File 
> "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", 
> line 124, in run_action 
> provider_action() 
> File 
> "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py",
>  line 262, in action_run 
> tries=self.resource.tries, try_sleep=self.resource.try_sleep) 
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", 
> line 72, in inner 
> result = function(command, **kwargs) 
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", 
> line 102, in checked_call 
> tries=tries, try_sleep=try_sleep, 
> timeout_kill_strategy=timeout_kill_strategy) 
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", 
> line 150, in _call_wrapper 
> result = _call(command, **kwargs_copy) 
> File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", 
> line 303, in _call 
> raise ExecutionFailed(err_msg, code, out, err)
> *ExecutionFailed: Execution of '! beeline -u 
> 'jdbc:hive2://hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016/default'
>  transportMode=http -e '' 2>&1| awk '
> {print}
> '|grep -i -e 'Connection refused' -e 'Invalid URL'' returned 1. Error: Could 
> not open client transport with JDBC Uri: 
> jdbc:hive2://hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016/default:
>  java.net.ConnectException: Connection refused (Connection refused) 
> (state=08S01,code=0)*
> Error: Could not open client transport with JDBC Uri: 
> jdbc:hive2://hn0-salqa0.lv5aupozrfhezhozcxr3xjcwqe.dx.internal.cloudapp.net:10016/default:
>  java.net.ConnectException: Connection refused (Connection refused) 
> (state=08S01,code=0)
> The alert appears to check the wrong port (10016 instead of 10002) when 
> the server is configured in http mode (transportMode=http).
> Reason
> =====
> From the logic in the script, the configured port 
> (HIVE_SERVER_THRIFT_PORT_KEY) is read only when the transport mode is 
> binary. In http mode the port stays at THRIFT_PORT_DEFAULT, so the alert 
> always checks port 10016. 
> ============
> THRIFT_PORT_DEFAULT = 10016
> HIVE_SERVER_TRANSPORT_MODE_DEFAULT = 'binary'
> port = THRIFT_PORT_DEFAULT
> if transport_mode.lower() == 'binary' and HIVE_SERVER_THRIFT_PORT_KEY in configurations:
>     port = int(configurations[HIVE_SERVER_THRIFT_PORT_KEY])
> ========
> Resolution
> ==========
> We should change the default port to 10002 in the alert script. 
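
One possible shape for the port selection described above is sketched below. It is not the attached patch. The constants THRIFT_PORT_DEFAULT and HIVE_SERVER_TRANSPORT_MODE_DEFAULT come from the script excerpt; the http default of 10002 is taken from the port named in this issue; the key strings, the *_HTTP_* names and the resolve_thrift_port helper are hypothetical and exist only for illustration.

```python
# Sketch only: key strings, the *_HTTP_* names and this helper are assumptions.
THRIFT_PORT_DEFAULT = 10016        # binary-mode default, from the script excerpt
THRIFT_HTTP_PORT_DEFAULT = 10002   # http-mode port named in this issue (assumed default)
HIVE_SERVER_TRANSPORT_MODE_DEFAULT = 'binary'

HIVE_SERVER_TRANSPORT_MODE_KEY = 'transport.mode'      # hypothetical key
HIVE_SERVER_THRIFT_PORT_KEY = 'thrift.port'            # hypothetical key
HIVE_SERVER_THRIFT_HTTP_PORT_KEY = 'thrift.http.port'  # hypothetical key


def resolve_thrift_port(configurations):
    """Return (transport_mode, port) that the connectivity check should use."""
    transport_mode = configurations.get(
        HIVE_SERVER_TRANSPORT_MODE_KEY,
        HIVE_SERVER_TRANSPORT_MODE_DEFAULT).lower()

    if transport_mode == 'http':
        # http mode: use the http port if configured, otherwise the http default.
        port = int(configurations.get(HIVE_SERVER_THRIFT_HTTP_PORT_KEY,
                                      THRIFT_HTTP_PORT_DEFAULT))
    else:
        # binary mode: same behaviour as the existing excerpt.
        port = int(configurations.get(HIVE_SERVER_THRIFT_PORT_KEY,
                                      THRIFT_PORT_DEFAULT))
    return transport_mode, port
```

Choosing the default per transport mode, rather than changing the single default to 10002, would keep binary-mode clusters on 10016 while http-mode clusters are checked on 10002.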



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
