Hello,

Slider version 0.80 with CDH 5.5.1

I am investigating an instance where a Slider application errored out.

slider.agent.log for many components shows the following trace. I noticed
that the "hostname" key is actually the container name, e.g. "hostname":
"container_e14_1513412386901_898934_03_000003___abc", while the "fqdn" key
shows the correct FQDN.

Any idea why the connection would be refused? The target of the EXECUTE
command had started correctly and is on the same node.

Thanks,

WARNING 2018-02-24 02:28:53,933 Controller.py:628 - Request failed! Data:
{"package": "", "nodeStatus": {"status": "HEALTHY", "cause": "NONE"},
"timestamp": 1519439333928, "hostname":
"container_e14_1513412386901_898934_03_000003___abc", "responseId": 6,
"fqdn": "<correct host name>", "reports": [{"status": "COMPLETED",
"stderr": "None", "stdout": "2018-02-24 02:28:23,800 - Execute['XYZ']
{'pid_file':
'/hadoop/disk10/yarn/logs/application_1513412386901_898934/container_e14_1513412386901_898934_03_000003/abc',
'wait_for_finish': False, 'logoutput': True, 'poll_after': 30}",
"clusterName": "foo", "structuredOut": "{}", "allocatedPorts": {},
"roleCommand": "START", "serviceName": "foo", "role": "abc", "actionId":
"15-1", "taskId": 15, "exitcode": 0}]}
INFO 2018-02-24 02:29:16,056 security.py:89 - SSL Connect being called..
connecting to the server
ERROR 2018-02-24 02:29:16,057 Controller.py:625 - Exception raised
Traceback (most recent call last):
  File
"/hadoop/disk1/yarn/local/usercache/xxx/appcache/application_1513412386901_898934/filecache/10/slider-agent.tar.gz/slider-agent/agent/Controller.py",
line 619, in sendRequest
    self.cachedconnect = security.CachedHTTPSConnection(self.config)
  File
"/hadoop/disk1/yarn/local/usercache/xxx/appcache/application_1513412386901_898934/filecache/10/slider-agent.tar.gz/slider-agent/agent/security.py",
line 106, in __init__
    self.connect()
  File
"/hadoop/disk1/yarn/local/usercache/xxx/appcache/application_1513412386901_898934/filecache/10/slider-agent.tar.gz/slider-agent/agent/security.py",
line 111, in connect
    self.httpsconn.connect()
  File
"/hadoop/disk1/yarn/local/usercache/xxx/appcache/application_1513412386901_898934/filecache/10/slider-agent.tar.gz/slider-agent/agent/security.py",
line 49, in connect
    sock=self.create_connection()
  File
"/hadoop/disk1/yarn/local/usercache/xxx/appcache/application_1513412386901_898934/filecache/10/slider-agent.tar.gz/slider-agent/agent/security.py",
line 90, in create_connection
    sock = socket.create_connection((self.host, self.port), 60)
  File "/usr/lib64/python2.6/socket.py", line 567, in create_connection
    raise error, msg
error: [Errno 111] Connection refused
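
The failing frame in the trace above is a plain socket.create_connection((self.host, self.port), 60) call, so the same connect can be tried outside the agent from the affected node. Below is a minimal sketch, not part of Slider: the probe helper is a hypothetical name, and the host and port have to be taken from whatever endpoint the agent is configured to report to (they are not shown in the log).

```python
import errno
import socket


def probe(host, port, timeout=60):
    """Try the same TCP connect the agent performs and report the outcome.

    Hypothetical helper for diagnosis only; it is not part of slider-agent.
    """
    try:
        sock = socket.create_connection((host, port), timeout)
    except socket.error as e:
        # Errno 111 on Linux is ECONNREFUSED.
        if e.errno == errno.ECONNREFUSED:
            return "refused"
        return "error: %s" % e
    sock.close()
    return "open"
```

A "refused" result typically means the host was reachable but nothing was listening on that port at that moment, which would point at the server side of the connection (process gone or port changed) rather than at the "hostname" value in the request body.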
