[ 
https://issues.apache.org/jira/browse/MESOS-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204495#comment-17204495
 ] 

Jaya Prasad Reddy G commented on MESOS-10186:
---------------------------------------------

We upgraded the libevent library to 2.1.8 and the stack trace is gone. Now 
we're seeing the following error when the agent tries to connect to the master. 
Master logs:
{code:java}
W0929 17:29:26.677386  5305 process.cpp:902] Failed to accept socket: Failed accept: connection error: error:1408F09C:SSL routines:ssl3_get_record:http request
W0929 17:29:29.677179  5305 process.cpp:902] Failed to accept socket: Failed accept: connection error: error:1408F09C:SSL routines:ssl3_get_record:http request
{code}
Agent logs:
{code:java}
I0930 06:51:23.400292 16911 slave.cpp:1367] Authenticating with master master@<master-ip>:5050
I0930 06:51:23.401051 16911 slave.cpp:1376] Using default CRAM-MD5 authenticatee
I0930 06:51:23.401329 16902 authenticatee.cpp:97] Initializing client SASL
I0930 06:51:23.401408 16902 authenticatee.cpp:121] Creating new client SASL connection
W0930 06:51:30.180505 16907 slave.cpp:1435] Failed to authenticate with master master@<master-ip>:5050: future discarded
I0930 06:51:30.180548 16907 slave.cpp:1367] Authenticating with master master@<master-ip>:5050
I0930 06:51:30.180569 16907 slave.cpp:1376] Using default CRAM-MD5 authenticatee
I0930 06:51:30.180663 16912 authenticatee.cpp:121] Creating new client SASL connection
W0930 06:51:30.180791 16914 slave.cpp:1405] Authentication timed out
W0930 06:51:37.228804 16914 slave.cpp:1405] Authentication timed out
W0930 06:51:37.228852 16909 slave.cpp:1435] Failed to authenticate with master master@<master-ip>:5050: future discarded
I0930 06:51:37.229040 16909 slave.cpp:1367] Authenticating with master master@<master-ip>:5050
I0930 06:51:37.229070 16909 slave.cpp:1376] Using default CRAM-MD5 authenticatee
I0930 06:51:37.229140 16899 authenticatee.cpp:121] Creating new client SASL connection
{code}
The certs on the agent and the master are signed by the same CA, and the 
passwords on both sides match, so we're unable to figure out why the 
connections are failing now.
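
For what it's worth, the reason string at the end of the master warning, 
"http request", is what OpenSSL reports when the bytes arriving on a TLS 
listener look like a plaintext HTTP request rather than a TLS record. One 
possible explanation (an assumption, not confirmed by anything in these logs) 
is that some peer, perhaps the agent, is connecting without SSL enabled. The 
Python sketch below only illustrates that distinction. The helper name 
classify_first_bytes and the method list are hypothetical and are not part of 
Mesos or OpenSSL.

```python
# Hypothetical helper: guess which protocol a peer spoke from the first
# bytes it sent. This is an illustration, not how Mesos or OpenSSL do it.

# A few common HTTP request-line prefixes (illustrative, not exhaustive).
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ")


def classify_first_bytes(data: bytes) -> str:
    """Return 'tls', 'http', or 'unknown' for the first bytes of a connection."""
    # A TLS handshake record starts with content type 0x16 followed by
    # protocol major version 0x03.
    if len(data) >= 2 and data[0] == 0x16 and data[1] == 0x03:
        return "tls"
    # A plaintext HTTP request starts with an ASCII method name and a space.
    if data.startswith(HTTP_METHODS):
        return "http"
    return "unknown"
```

If a capture of the traffic reaching port 5050 classified as "http" here, the 
next thing to compare would be the LIBPROCESS_SSL_* environment on the 
connecting side against the master's.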

> Segmentation fault while running mesos in SSL mode
> --------------------------------------------------
>
>                 Key: MESOS-10186
>                 URL: https://issues.apache.org/jira/browse/MESOS-10186
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.7.0, 1.9.0
>            Reporter: Jaya Prasad Reddy G
>            Priority: Blocker
>
> Hello,
> I've been running into segmentation faults while running mesos in SSL mode.
> After backtracing the coredump, this is what I found:
> {code:java}
> #0  0x00007f93060b7592 in free () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x00007f9303ebe8cd in CRYPTO_free () from 
> /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
> #2  0x00007f9303f7bfde in EVP_CIPHER_CTX_cleanup () from 
> /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
> #3  0x00007f93042e4275 in ?? () from /lib/x86_64-linux-gnu/libssl.so.1.0.0
> #4  0x00007f93042e59ca in SSL_set_accept_state () from 
> /lib/x86_64-linux-gnu/libssl.so.1.0.0
> #5  0x00007f93059a5c72 in ?? () from 
> /usr/lib/x86_64-linux-gnu/libevent_openssl-2.0.so.5
> #6  0x00007f930a801a74 in 
> process::network::internal::LibeventSSLSocketImpl::accept_SSL_callback 
> (request=request@entry=0x7f92f8000a80) at 
> src/posix/libevent/libevent_ssl_socket.cpp:1172
> #7  0x00007f930a8021ea in 
> process::network::internal::LibeventSSLSocketImpl::accept_callback 
> (this=this@entry=0x55569161b430, request=request@entry=0x7f92f8000a80) at 
> src/posix/libevent/libevent_ssl_socket.cpp:1124
> #8  0x00007f930a8027bc in 
> process::network::internal::LibeventSSLSocketImpl::<lambda(evconnlistener*, 
> int, sockaddr*, int, void*)>::operator()(evconnlistener *, int, sockaddr *, 
> void *, int) (listener=0x555691667980, socket=24, 
>     addr=<optimized out>, arg=0x55569161be30, addr_length=<optimized out>, 
> __closure=<optimized out>) at src/posix/libevent/libevent_ssl_socket.cpp:988
> #9  0x00007f9305e0829c in ?? () from 
> /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
> #10 0x00007f9305dfa639 in event_base_loop () from 
> /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
> #11 0x00007f930a81c41d in process::EventLoop::run () at 
> src/posix/libevent/libevent.cpp:98
> #12 0x00007f93066cbd00 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #13 0x00007f930699c6ba in start_thread () from 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #14 0x00007f930613a60d in clone () from /lib/x86_64-linux-gnu/libc.so.6
> {code}
> The environment variables exported before running mesos are:
> {code:java}
> MESOS_SSL_CIPHERS=ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA
>  
> LIBPROCESS_SSL_ENABLED=true 
> LIBPROCESS_SSL_SUPPORT_DOWNGRADE=false
> LIBPROCESS_SSL_CA_DIR=/opt/mesos/secrets/certs/
> LIBPROCESS_SSL_KEY_FILE=/opt/mesos/secrets/certs/mesos.private_key 
> LIBPROCESS_SSL_CERT_FILE=/opt/mesos/secrets/certs/mesos.certificate 
> LIBPROCESS_SSL_VERIFY_CERT=false
> LIBPROCESS_SSL_REQUIRE_CERT=false 
> LIBPROCESS_SSL_VERIFY_IPADD=false 
> LIBPROCESS_SSL_CA_FILE=/etc/ssl_ca/ca_list.pem 
> LIBPROCESS_SSL_CIPHERS=ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA
>  
> LIBPROCESS_SSL_ENABLE_SSL_V3=false 
> LIBPROCESS_SSL_ENABLE_TLS_V1_0=false 
> LIBPROCESS_SSL_ENABLE_TLS_V1_1=false 
> LIBPROCESS_SSL_ENABLE_TLS_V1_2=true 
> ZOOKEEPER_SSL_ENABLED=0 
> ZOOKEEPER_SSL_VERIFY_CERT=false
> ZOOKEEPER_SSL_REQUIRE_CERT=false 
> ZOOKEEPER_SSL_CA_FILE=/etc/ssl_ca/ca_list.pem 
> ZOOKEEPER_SSL_KEY_FILE=/opt/mesos/secrets/certs/mesos.private_key 
> ZOOKEEPER_SSL_CERT_FILE=/opt/mesos/secrets/certs/mesos.certificate 
> {code}
> And the command used to run mesos:
> {code:java}
> mesos_master --ip=172.24.51.99  --advertise_ip=172.24.51.99 
> --hostname_lookup=false --acls=file:///opt/mesos/etc/acl.json 
> --modules=file:///opt/mesos/etc/modules.json --port=5050 --quorum=3 
> --work_dir=/ghostcache2/mesos/registry 
> --zk=zk://mycluster1.random.cluster.com:2181,mycluster2.random.cluster.com:2181,mycluster3.random.cluster.com:2181,mycluster4.random.cluster.com:2181,mycluster5.random.cluster.com:2181/mesos.mycluster
>  --cluster=mycluster --hostname=mycluster2.random.cluster.com 
> --offer_timeout=30secs --webui_dir=/opt/mesos/share/mesos/webui 
> --whitelist=file:///opt/mesos/etc/whitelist --authenticate_agents=true 
> --authenticators=crammd5 --authorizers=local 
> --credentials=file:///opt/mesos/secrets/all_credentials.json 
> --authenticate_frameworks=true{code}
> Output of lddtree for mesos_master:
> {code:java}
> ~# lddtree /a/sbin/mesos_master 
> mesos_master => /a/sbin/mesos_master (interpreter => 
> /lib64/ld-linux-x86-64.so.2)
>     libmesos-1.9.0.so => /usr/local/myproject/opt/mesos/lib/libmesos-1.9.0.so
>         libevent-2.0.so.5 => /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
>         libblkid.so.1 => /lib/x86_64-linux-gnu/libblkid.so.1
>         libevent_openssl-2.0.so.5 => 
> /usr/lib/x86_64-linux-gnu/libevent_openssl-2.0.so.5
>             libevent_core-2.0.so.5 => 
> /usr/lib/x86_64-linux-gnu/libevent_core-2.0.so.5
>             libssl.so.1.0.0 => /lib/x86_64-linux-gnu/libssl.so.1.0.0
>             libcrypto.so.1.0.0 => /lib/x86_64-linux-gnu/libcrypto.so.1.0.0
>         libevent_pthreads-2.0.so.5 => 
> /usr/lib/x86_64-linux-gnu/libevent_pthreads-2.0.so.5
>         libuuid.so.1 => /lib/x86_64-linux-gnu/libuuid.so.1
>         libcrypto.so.111.0.0 => 
> /usr/local/myproject/lib/openssl/x86_64-linux-gnu/libcrypto.so.111.0.0
>         libssl.so.111.0.0 => 
> /usr/local/myproject/lib/openssl/x86_64-linux-gnu/libssl.so.111.0.0
>         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2
>         librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1
>         libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6
>         ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2
>     libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
>     libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>     libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
>     libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6{code}
>  
> I tried updating mesos (from 1.7.0 to 1.9.0) and libevent (from 2.0.21 to 
> 2.0.22, 2.1.8 and 2.1.12) to the latest versions, but still no luck. I'm not 
> sure what I'm missing. Any help is appreciated.
>  
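
The lddtree output quoted above lists two different sonames for each OpenSSL 
library: libssl.so.1.0.0 and libcrypto.so.1.0.0 (pulled in through 
libevent_openssl-2.0.so.5), and libssl.so.111.0.0 and libcrypto.so.111.0.0 
(linked by libmesos directly). Whether that mix is related to the crash in 
CRYPTO_free is an assumption and is not established in this thread. As a 
minimal Python sketch, the following scans lddtree text for such a mix. The 
function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches sonames such as "libssl.so.1.0.0" or "libcrypto.so.111.0.0".
SONAME = re.compile(r"\b(libssl|libcrypto)\.so\.([0-9]+(?:\.[0-9]+)*)")


def openssl_versions(lddtree_output: str) -> dict:
    """Map 'libssl' / 'libcrypto' to the sorted distinct soname versions seen."""
    found = defaultdict(set)
    for lib, version in SONAME.findall(lddtree_output):
        found[lib].add(version)
    return {lib: sorted(versions) for lib, versions in found.items()}


def has_mixed_openssl(lddtree_output: str) -> bool:
    """True if any OpenSSL library appears with more than one soname version."""
    return any(len(v) > 1 for v in openssl_versions(lddtree_output).values())
```

Feeding it the saved output of `lddtree /a/sbin/mesos_master` would show 
whether a rebuilt libevent still drags in the 1.0.0 libraries.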



