Gour Saha created SLIDER-333:
--------------------------------

             Summary: Slider fails to create/start an application when 2-way SSL is enabled
                 Key: SLIDER-333
                 URL: https://issues.apache.org/jira/browse/SLIDER-333
             Project: Slider
          Issue Type: Bug
    Affects Versions: Slider 0.40
            Reporter: Gour Saha


I enabled 2-way SSL by setting "ssl.server.client.auth" to "true" in 
appConfig.json, but the AM and agent logs show the errors below. This 
happens on the latest develop branch.

AM Log:
{noformat}
14/08/18 16:46:54 INFO appmaster.SliderAppMaster: Connecting to RM at 
38306,address tracking URL=http://c6409.ambari.apache.org:38315
14/08/18 16:46:54 INFO agent.AgentUtils: Reading metainfo at 
apache-slider-hbase-0.98.3-hadoop2-app-package-0.31.0-incubating-SNAPSHOT.zip
14/08/18 16:46:54 INFO tools.SliderUtils: Reading metainfo.xml of size 3802
14/08/18 16:46:54 INFO agent.HeartbeatMonitor: Starting heartbeat monitor with 
interval 60000
14/08/18 16:46:54 INFO state.AppState: Adding new role HBASE_REGIONSERVER
14/08/18 16:46:54 INFO state.AppState: Role HBASE_REGIONSERVER assigned 
priority 2
14/08/18 16:46:54 INFO state.AppState: Adding new role HBASE_MASTER
14/08/18 16:46:54 INFO state.AppState: Role HBASE_MASTER assigned priority 1
14/08/18 16:46:54 INFO state.AppState: Role slider-appmaster flexed from 0 to 1
14/08/18 16:46:54 INFO state.AppState: Role HBASE_REGIONSERVER flexed from 0 to 
1
14/08/18 16:46:54 INFO state.AppState: Role HBASE_MASTER flexed from 0 to 1
14/08/18 16:46:54 INFO state.RoleHistory: Role history bootstrapped
14/08/18 16:46:54 INFO appmaster.SliderAppMaster: service instances already 
running: []
14/08/18 16:46:55 INFO curator.RegistryBinderService: registering 
ServiceInstance{name='org-apache-slider', id='cl1', address='192.168.64.109', 
port=38315, sslPort=null, payload=ServiceInstanceData{id='cl1', 
serviceType='org-apache-slider'}, registrationTimeUTC=1408380415215, 
serviceType=DYNAMIC, uriSpec=org.apache.curator.x.discovery.UriSpec@bd91710d}
14/08/18 16:46:55 INFO curator.RegistryBinderService: registration completed 
ServiceInstance{name='org-apache-slider', id='cl1', address='192.168.64.109', 
port=38315, sslPort=null, payload=ServiceInstanceData{id='cl1', 
serviceType='org-apache-slider'}, registrationTimeUTC=1408380415215, 
serviceType=DYNAMIC, uriSpec=org.apache.curator.x.discovery.UriSpec@bd91710d}
14/08/18 16:46:55 INFO appmaster.SliderAppMaster: Chaos monkey disabled
14/08/18 16:46:55 INFO appmaster.SliderAppMaster: Adding Chaos Monkey scheduled 
every 0 seconds (0 hours)
14/08/18 16:46:55 INFO workflow.WorkflowCompositeService: Child service 
completed Service SliderAMProviderService in state SliderAMProviderService: 
STOPPED; current service null; queued service count=0
14/08/18 16:46:55 INFO appmaster.SliderAppMaster: Process has exited with exit 
code 0 mapped to 0 -ignoring
14/08/18 16:46:55 INFO state.AppState: RoleStatus{name='HBASE_REGIONSERVER', 
key=2, desired=1, actual=0, requested=0, releasing=0, failed=0, started=0, 
startFailed=0, completed=0, failureMessage=''}
14/08/18 16:46:55 INFO state.AppState: HBASE_REGIONSERVER: Asking for 1 more 
nodes(s) for a total of 1
14/08/18 16:46:55 INFO state.RoleHistory: There're 0 nodes to consider for 
HBASE_REGIONSERVER
14/08/18 16:46:55 INFO state.AppState: Container ask is Capability[<memory:256, 
vCores:1>]Priority[1073741826]
14/08/18 16:46:55 INFO state.AppState: RoleStatus{name='HBASE_MASTER', key=1, 
desired=1, actual=0, requested=0, releasing=0, failed=0, started=0, 
startFailed=0, completed=0, failureMessage=''}
14/08/18 16:46:55 INFO state.AppState: HBASE_MASTER: Asking for 1 more nodes(s) 
for a total of 1
14/08/18 16:46:55 INFO state.RoleHistory: There're 0 nodes to consider for 
HBASE_MASTER
14/08/18 16:46:55 INFO state.AppState: Container ask is Capability[<memory:256, 
vCores:1>]Priority[1073741825]
14/08/18 16:46:57 INFO impl.AMRMClientImpl: Received new token for : 
c6409.ambari.apache.org:45454
14/08/18 16:46:57 INFO appmaster.SliderAppMaster: onContainersAllocated(1)
14/08/18 16:46:57 INFO state.AppState: Assigning role HBASE_MASTER to container 
container_1407891977820_0028_01_000002, on c6409.ambari.apache.org:45454,
14/08/18 16:46:57 INFO appmaster.SliderAppMaster: Diagnostics: 
RoleStatus{name='slider-appmaster', key=0, desired=1, actual=0, requested=0, 
releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
RoleStatus{name='HBASE_REGIONSERVER', key=2, desired=1, actual=0, requested=1, 
releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
RoleStatus{name='HBASE_MASTER', key=1, desired=1, actual=1, requested=0, 
releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}

14/08/18 16:46:57 INFO agent.AgentProviderService: Build launch context for 
Agent
14/08/18 16:46:57 INFO agent.AgentProviderService: AGENT_WORK_ROOT set to $PWD
14/08/18 16:46:57 INFO agent.AgentProviderService: AGENT_LOG_ROOT set to 
$LOG_DIRS
14/08/18 16:46:57 INFO agent.AgentProviderService: PYTHONPATH set to 
./infra/agent/slider-agent/
14/08/18 16:46:57 INFO agent.AgentProviderService: Using 
./infra/agent/slider-agent/agent/main.py for agent.
14/08/18 16:46:57 INFO appmaster.RoleLaunchService: Starting container with 
command: python ./infra/agent/slider-agent/agent/main.py --label 
container_1407891977820_0028_01_000002___HBASE_MASTER --zk-quorum 
c6409.ambari.apache.org:2181 --zk-reg-path /registry/org-apache-slider/cl1 ;
14/08/18 16:46:57 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
START_CONTAINER for Container container_1407891977820_0028_01_000002
14/08/18 16:46:57 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
c6409.ambari.apache.org:45454
14/08/18 16:46:57 INFO appmaster.SliderAppMaster: Started Container 
container_1407891977820_0028_01_000002
14/08/18 16:46:57 INFO appmaster.SliderAppMaster: Deployed instance of role 
HBASE_MASTER onto container_1407891977820_0028_01_000002
14/08/18 16:46:57 INFO appmaster.SliderAppMaster: Registering component 
container_1407891977820_0028_01_000002
14/08/18 16:46:57 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
QUERY_CONTAINER for Container container_1407891977820_0028_01_000002
14/08/18 16:46:58 INFO appmaster.SliderAppMaster: onContainersAllocated(1)
14/08/18 16:46:58 INFO state.AppState: Assigning role HBASE_REGIONSERVER to 
container container_1407891977820_0028_01_000003, on 
c6409.ambari.apache.org:45454,
14/08/18 16:46:58 INFO appmaster.SliderAppMaster: Diagnostics: 
RoleStatus{name='slider-appmaster', key=0, desired=1, actual=0, requested=0, 
releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
RoleStatus{name='HBASE_REGIONSERVER', key=2, desired=1, actual=1, requested=0, 
releasing=0, failed=0, started=0, startFailed=0, completed=0, failureMessage=''}
RoleStatus{name='HBASE_MASTER', key=1, desired=1, actual=1, requested=0, 
releasing=0, failed=0, started=1, startFailed=0, completed=0, failureMessage=''}

14/08/18 16:46:58 INFO agent.AgentProviderService: Build launch context for 
Agent
14/08/18 16:46:58 INFO agent.AgentProviderService: AGENT_WORK_ROOT set to $PWD
14/08/18 16:46:58 INFO agent.AgentProviderService: AGENT_LOG_ROOT set to 
$LOG_DIRS
14/08/18 16:46:58 INFO agent.AgentProviderService: PYTHONPATH set to 
./infra/agent/slider-agent/
14/08/18 16:46:58 INFO agent.AgentProviderService: Using 
./infra/agent/slider-agent/agent/main.py for agent.
14/08/18 16:46:58 INFO appmaster.RoleLaunchService: Starting container with 
command: python ./infra/agent/slider-agent/agent/main.py --label 
container_1407891977820_0028_01_000003___HBASE_REGIONSERVER --zk-quorum 
c6409.ambari.apache.org:2181 --zk-reg-path /registry/org-apache-slider/cl1 ;
14/08/18 16:46:58 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
START_CONTAINER for Container container_1407891977820_0028_01_000003
14/08/18 16:46:58 INFO appmaster.SliderAppMaster: Started Container 
container_1407891977820_0028_01_000003
14/08/18 16:46:58 INFO appmaster.SliderAppMaster: Deployed instance of role 
HBASE_REGIONSERVER onto container_1407891977820_0028_01_000003
14/08/18 16:46:58 INFO appmaster.SliderAppMaster: Registering component 
container_1407891977820_0028_01_000003
14/08/18 16:46:58 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
QUERY_CONTAINER for Container container_1407891977820_0028_01_000003
14/08/18 16:47:00 WARN mortbay.log: javax.net.ssl.SSLHandshakeException: null 
cert chain
14/08/18 16:47:00 WARN mortbay.log: javax.net.ssl.SSLHandshakeException: null 
cert chain
14/08/18 16:47:10 WARN mortbay.log: javax.net.ssl.SSLHandshakeException: null 
cert chain
{noformat}

Agent Log:
{noformat}
INFO 2014-08-18 16:46:59,525 main.py:85 - loglevel=logging.INFO
INFO 2014-08-18 16:46:59,525 main.py:245 - Using AGENT_WORK_ROOT = 
/hadoop/yarn/local/usercache/yarn/appcache/application_1407891977820_0028/container_1407891977820_0028_01_000002
INFO 2014-08-18 16:46:59,526 main.py:246 - Using AGENT_LOG_ROOT = 
/hadoop/yarn/log/application_1407891977820_0028/container_1407891977820_0028_01_000002
INFO 2014-08-18 16:46:59,526 main.py:256 - Connecting to the server at: 
https://c6409.ambari.apache.org:55659/ws/v1/slider/agents/
INFO 2014-08-18 16:46:59,526 NetUtil.py:67 - DEBUG: Trying to connect to the 
server at https://c6409.ambari.apache.org:55659/ws/v1/slider/agents/
INFO 2014-08-18 16:46:59,526 NetUtil.py:38 - Connecting to the following url 
https://c6409.ambari.apache.org:55659/ws/v1/slider/agents/
ERROR 2014-08-18 16:47:00,013 NetUtil.py:52 - [Errno 8] _ssl.c:490: EOF 
occurred in violation of protocol
ERROR 2014-08-18 16:47:00,014 NetUtil.py:54 - SSLError: Failed to connect. 
Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
INFO 2014-08-18 16:47:00,014 NetUtil.py:76 - Server at 
https://c6409.ambari.apache.org:55659/ws/v1/slider/agents/ is not reachable, 
sleeping for 10 seconds...
INFO 2014-08-18 16:47:10,024 NetUtil.py:38 - Connecting to the following url 
https://c6409.ambari.apache.org:55659/ws/v1/slider/agents/
ERROR 2014-08-18 16:47:10,310 NetUtil.py:52 - [Errno 8] _ssl.c:490: EOF 
occurred in violation of protocol
ERROR 2014-08-18 16:47:10,310 NetUtil.py:54 - SSLError: Failed to connect. 
Please check openssl library versions.
Refer to: https://bugzilla.redhat.com/show_bug.cgi?id=1022468 for more details.
{noformat}
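
For context, a "null cert chain" SSLHandshakeException on the server side typically means the client sent no certificate during the handshake, which would also be consistent with the EOF the agent sees. The sketch below is not the Slider agent's code. It is a minimal, hypothetical illustration (the function name and the three file-path parameters are assumptions) of what a Python client has to do to present a certificate to a server that requires client authentication:

```python
import ssl


def build_client_context(ca_file=None, cert_file=None, key_file=None):
    """Build a client-side SSLContext for a server that requires client certs.

    All three parameters are hypothetical file paths, not Slider settings.
    """
    # Verify the server against ca_file, or the system trust store if None.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file is not None:
        # Without this call the client presents no certificate, and a server
        # that requires client auth is expected to reject the handshake.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context
```

Such a context could then be passed to http.client.HTTPSConnection(host, port, context=context). Whether the agent has a certificate and key available at all when 2-way SSL is turned on is the open question in this issue.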



--
This message was sent by Atlassian JIRA
(v6.2#6252)
