Gour Saha created SLIDER-333:
--------------------------------
Summary: Slider fails to create/start an application when 2-way
SSL is enabled
Key: SLIDER-333
URL: https://issues.apache.org/jira/browse/SLIDER-333
Project: Slider
Issue Type: Bug
Affects Versions: Slider 0.40
Reporter: Gour Saha
I enabled 2-way SSL by setting "ssl.server.client.auth" to "true" in
appConfig.json, and the application now fails to start with the following
errors in the AM and agent logs. This happens on the latest develop branch.
AM Log:
{noformat}
14/08/18 16:46:54 INFO appmaster.SliderAppMaster: Connecting to RM at
38306,address tracking URL=http://c6409.ambari.apache.org:38315
14/08/18 16:46:54 INFO agent.AgentUtils: Reading metainfo at
apache-slider-hbase-0.98.3-hadoop2-app-package-0.31.0-incubating-SNAPSHOT.zip
14/08/18 16:46:54 INFO tools.SliderUtils: Reading metainfo.xml of size 3802
14/08/18 16:46:54 INFO agent.HeartbeatMonitor: Starting heartbeat monitor with
interval 60000
14/08/18 16:46:54 INFO state.AppState: Adding new role HBASE_REGIONSERVER
14/08/18 16:46:54 INFO state.AppState: Role HBASE_REGIONSERVER assigned
priority 2
14/08/18 16:46:54 INFO state.AppState: Adding new role HBASE_MASTER
14/08/18 16:46:54 INFO state.AppState: Role HBASE_MASTER assigned priority 1
14/08/18 16:46:54 INFO state.AppState: Role slider-appmaster flexed from 0 to 1
14/08/18 16:46:54 INFO state.AppState: Role HBASE_REGIONSERVER flexed from 0 to
1
14/08/18 16:46:54 INFO state.AppState: Role HBASE_MASTER flexed from 0 to 1
14/08/18 16:46:54 INFO state.RoleHistory: Role history bootstrapped
14/08/18 16:46:54 INFO appmaster.SliderAppMaster: service instances already
running: []
14/08/18 16:46:55 INFO curator.RegistryBinderService: registering
ServiceInstance{name='org-apache-slider', id='cl1', address='192.168.64.109',
port=38315, sslPort=null, payload=ServiceInstanceData{id='cl1',
serviceType='org-apache-slider'}, registrationTimeUTC=1408380415215,
serviceType=DYNAMIC, uriSpec=org.apache.curator.x.discovery.UriSpec@bd91710d}
14/08/18 16:46:55 INFO curator.RegistryBinderService: registration completed
ServiceInstance{name='org-apache-slider', id='cl1', address='192.168.64.109',
port=38315, sslPort=null, payload=ServiceInstanceData{id='cl1',
serviceType='org-apache-slider'}, registrationTimeUTC=1408380415215,
serviceType=DYNAMIC, uriSpec=org.apache.curator.x.discovery.UriSpec@bd91710d}
14/08/18 16:46:55 INFO appmaster.SliderAppMaster: Chaos monkey disabled
14/08/18 16:46:55 INFO appmaster.SliderAppMaster: Adding Chaos Monkey scheduled
every 0 seconds (0 hours)
14/08/18 16:46:55 INFO workflow.WorkflowCompositeService: Child service
completed Service SliderAMProviderService in state SliderAMProviderService:
STOPPED; current service null; queued service count=0
14/08/18 16:46:55 INFO appmaster.SliderAppMaster: Process has exited with exit
code 0 mapped to 0 -ignoring
14/08/18 16:46:55 INFO state.AppState: RoleStatus{name='HBASE_REGIONSERVER',
key=2, desired=1, actual=0, requested=0, releasing=0, failed=0, started=0,
startFailed=0, completed=0, failureMessage=''}
14/08/18 16:46:55 INFO state.AppState: HBASE_REGIONSERVER: Asking for 1 more
nodes(s) for a total of 1
14/08/18 16:46:55 INFO state.RoleHistory: There're 0 nodes to consider for
HBASE_REGIONSERVER
14/08/18 16:46:55 INFO state.AppState: Container ask is Capability[<memory:256,
vCores:1>]Priority[1073741826]
14/08/18 16:46:55 INFO state.AppState: RoleStatus{name='HBASE_MASTER', key=1,
desired=1, actual=0, requested=0, releasing=0, failed=0, started=0,
startFailed=0, completed=0, failureMessage=''}
14/08/18 16:46:55 INFO state.AppState: HBASE_MASTER: Asking for 1 more nodes(s)
for a total of 1
14/08/18 16:46:55 INFO state.RoleHistory: There're 0 nodes to consider for
HBASE_MASTER
14/08/18 16:46:55 INFO state.AppState: Container ask is Capability[<memory:256,
vCores:1>]Priority[1073741825]
14/08/18 16:46:57 INFO impl.AMRMClientImpl: Received new token for :
c6409.ambari.apache.org:45454
14/08/18 16:46:57 INFO appmaster.SliderAppMaster: onContainersAllocated(1)
14/08/18 16:46:57 INFO state.AppState: Assigning role HBASE_MASTER to container
container_1407891977820_0028_01_000002, on c6409.ambari.apache.org:45454,
14/08/18 16:46:57 INFO appmaster.SliderAppMaster: Diagnostics:
RoleStatus{name='slider-appmaster', key=0, desired=1, actual=0, requested=0,
releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
RoleStatus{name='HBASE_REGIONSERVER', key=2, desired=1, actual=0, requested=1,
releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
RoleStatus{name='HBASE_MASTER', key=1, desired=1, actual=1, requested=0,
releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
14/08/18 16:46:57 INFO agent.AgentProviderService: Build launch context for
Agent
14/08/18 16:46:57 INFO agent.AgentProviderService: AGENT_WORK_ROOT set to $PWD
14/08/18 16:46:57 INFO agent.AgentProviderService: AGENT_LOG_ROOT set to
$LOG_DIRS
14/08/18 16:46:57 INFO agent.AgentProviderService: PYTHONPATH set to
./infra/agent/slider-agent/
14/08/18 16:46:57 INFO agent.AgentProviderService: Using
./infra/agent/slider-agent/agent/main.py for agent.
14/08/18 16:46:57 INFO appmaster.RoleLaunchService: Starting container with
command: python ./infra/agent/slider-agent/agent/main.py --label
container_1407891977820_0028_01_000002___HBASE_MASTER --zk-quorum
c6409.ambari.apache.org:2181 --zk-reg-path /registry/org-apache-slider/cl1 ;
14/08/18 16:46:57 INFO impl.NMClientAsyncImpl: Processing Event EventType:
START_CONTAINER for Container container_1407891977820_0028_01_000002
14/08/18 16:46:57 INFO impl.ContainerManagementProtocolProxy: Opening proxy :
c6409.ambari.apache.org:45454
14/08/18 16:46:57 INFO appmaster.SliderAppMaster: Started Container
container_1407891977820_0028_01_000002
14/08/18 16:46:57 INFO appmaster.SliderAppMaster: Deployed instance of role
HBASE_MASTER onto container_1407891977820_0028_01_000002
14/08/18 16:46:57 INFO appmaster.SliderAppMaster: Registering component
container_1407891977820_0028_01_000002
14/08/18 16:46:57 INFO impl.NMClientAsyncImpl: Processing Event EventType:
QUERY_CONTAINER for Container container_1407891977820_0028_01_000002
14/08/18 16:46:58 INFO appmaster.SliderAppMaster: onContainersAllocated(1)
14/08/18 16:46:58 INFO state.AppState: Assigning role HBASE_REGIONSERVER to
container container_1407891977820_0028_01_000003, on
c6409.ambari.apache.org:45454,
14/08/18 16:46:58 INFO appmaster.SliderAppMaster: Diagnostics:
RoleStatus{name='slider-appmaster', key=0, desired=1, actual=0, requested=0,
releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
RoleStatus{name='HBASE_REGIONSERVER', key=2, desired=1, actual=1, requested=0,
releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
RoleStatus{name='HBASE_MASTER', key=1, desired=1, actual=1, requested=0,
releasing=0, failed=0, started=1, startFailed=0, completed=0, failureMessage=''}
14/08/18 16:46:58 INFO agent.AgentProviderService: Build launch context for
Agent
14/08/18 16:46:58 INFO agent.AgentProviderService: AGENT_WORK_ROOT set to $PWD
14/08/18 16:46:58 INFO agent.AgentProviderService: AGENT_LOG_ROOT set to
$LOG_DIRS
14/08/18 16:46:58 INFO agent.AgentProviderService: PYTHONPATH set to
./infra/agent/slider-agent/
14/08/18 16:46:58 INFO agent.AgentProviderService: Using
./infra/agent/slider-agent/agent/main.py for agent.
14/08/18 16:46:58 INFO appmaster.RoleLaunchService: Starting container with
command: python ./infra/agent/slider-agent/agent/main.py --label
container_1407891977820_0028_01_000003___HBASE_REGIONSERVER --zk-quorum
c6409.ambari.apache.org:2181 --zk-reg-path /registry/org-apache-slider/cl1 ;
14/08/18 16:46:58 INFO impl.NMClientAsyncImpl: Processing Event EventType:
START_CONTAINER for Container container_1407891977820_0028_01_000003
14/08/18 16:46:58 INFO appmaster.SliderAppMaster: Started Container
container_1407891977820_0028_01_000003
14/08/18 16:46:58 INFO appmaster.SliderAppMaster: Deployed instance of role
HBASE_REGIONSERVER onto container_1407891977820_0028_01_000003
14/08/18 16:46:58 INFO appmaster.SliderAppMaster: Registering component
container_1407891977820_0028_01_000003
14/08/18 16:46:58 INFO impl.NMClientAsyncImpl: Processing Event EventType:
QUERY_CONTAINER for Container container_1407891977820_0028_01_000003
14/08/18 16:47:00 WARN mortbay.log: javax.net.ssl.SSLHandshakeException: null
cert chain
14/08/18 16:47:00 WARN mortbay.log: javax.net.ssl.SSLHandshakeException: null
cert chain
14/08/18 16:47:10 WARN mortbay.log: javax.net.ssl.SSLHandshakeException: null
cert chain
{noformat}
Agent Log:
{noformat}
INFO 2014-08-18 16:46:59,525 main.py:85 - loglevel=logging.INFO
INFO 2014-08-18 16:46:59,525 main.py:245 - Using AGENT_WORK_ROOT =
/hadoop/yarn/local/usercache/yarn/appcache/application_1407891977820_0028/container_1407891977820_0028_01_000002
INFO 2014-08-18 16:46:59,526 main.py:246 - Using AGENT_LOG_ROOT =
/hadoop/yarn/log/application_1407891977820_0028/container_1407891977820_0028_01_000002
INFO 2014-08-18 16:46:59,526 main.py:256 - Connecting to the server at:
https://c6409.ambari.apache.org:55659/ws/v1/slider/agents/
INFO 2014-08-18 16:46:59,526 NetUtil.py:67 - DEBUG: Trying to connect to the
server at https://c6409.ambari.apache.org:55659/ws/v1/slider/agents/
INFO 2014-08-18 16:46:59,526 NetUtil.py:38 - Connecting to the following url
https://c6409.ambari.apache.org:55659/ws/v1/slider/agents/
ERROR 2014-08-18 16:47:00,013 NetUtil.py:52 - [Errno 8] _ssl.c:490: EOF
occurred in violation of protocol
ERROR 2014-08-18 16:47:00,014 NetUtil.py:54 - SSLError: Failed to connect.
Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
INFO 2014-08-18 16:47:00,014 NetUtil.py:76 - Server at
https://c6409.ambari.apache.org:55659/ws/v1/slider/agents/ is not reachable,
sleeping for 10 seconds...
INFO 2014-08-18 16:47:10,024 NetUtil.py:38 - Connecting to the following url
https://c6409.ambari.apache.org:55659/ws/v1/slider/agents/
ERROR 2014-08-18 16:47:10,310 NetUtil.py:52 - [Errno 8] _ssl.c:490: EOF
occurred in violation of protocol
ERROR 2014-08-18 16:47:10,310 NetUtil.py:54 - SSLError: Failed to connect.
Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
{noformat}
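For context on the two errors above: a "null cert chain" handshake failure on the server side typically means the server required a client certificate and the client did not send one, which would match the agent seeing the connection dropped mid-handshake. Below is a minimal sketch of what a Python client needs in order to present a certificate under 2-way SSL. It is not the Slider agent's actual code; the function name and the certificate/key/CA path parameters are hypothetical, and it assumes a modern Python 3 `ssl` module rather than the Python version the agent ran on.

```python
import ssl


def build_client_context(certfile=None, keyfile=None, cafile=None):
    """Build a client-side TLS context (hypothetical helper, for illustration).

    certfile/keyfile: the client's own certificate and private key (PEM).
    cafile: CA bundle used to verify the server; system defaults if None.
    """
    # Verify the server's certificate and hostname.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=cafile)
    if certfile is not None:
        # Present a client certificate during the handshake. Without this
        # call, a server that requires client authentication receives no
        # certificate chain from the client and rejects the handshake.
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context
```

Calling `build_client_context()` with no arguments gives a context that verifies the server but presents nothing of its own, which is the situation the logs suggest. Passing `certfile` and `keyfile` is the part that 2-way SSL adds.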