Bhuvan Arumugam created MESOS-2043:
--------------------------------------
Summary: framework auth fail with timeout error and never get
authenticated
Key: MESOS-2043
URL: https://issues.apache.org/jira/browse/MESOS-2043
Project: Mesos
Issue Type: Bug
Components: master
Affects Versions: 0.21.0
Reporter: Bhuvan Arumugam
I'm facing this issue in master as of
https://github.com/apache/mesos/commit/74ea59e144d131814c66972fb0cc14784d3503d4
As [~adam-mesos] mentioned on IRC, this sounds similar to MESOS-1866. I'm
running 1 master and 1 scheduler (Aurora). Framework authentication fails
with a timeout.
Error on the Mesos master:
{code}
I1104 19:37:17.741449 8329 master.cpp:3874] Authenticating
scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083
I1104 19:37:17.741585 8329 master.cpp:3885] Using default CRAM-MD5
authenticator
I1104 19:37:17.742106 8336 authenticator.hpp:169] Creating new server SASL
connection
W1104 19:37:22.742959 8329 master.cpp:3953] Authentication timed out
W1104 19:37:22.743548 8329 master.cpp:3930] Failed to authenticate
scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083:
Authentication discarded
{code}
Error on the scheduler:
{code}
I1104 19:38:57.885486 49012 sched.cpp:283] Authenticating with master
master@MASTER_IP:PORT
I1104 19:38:57.885928 49002 authenticatee.hpp:133] Creating new client SASL
connection
I1104 19:38:57.890581 49007 authenticatee.hpp:224] Received SASL authentication
mechanisms: CRAM-MD5
I1104 19:38:57.890656 49007 authenticatee.hpp:250] Attempting to authenticate
with mechanism 'CRAM-MD5'
W1104 19:39:02.891196 49005 sched.cpp:378] Authentication timed out
I1104 19:39:02.891850 49018 sched.cpp:338] Failed to authenticate with master
master@MASTER_IP:PORT: Authentication discarded
{code}
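In both logs the "Authentication timed out" warning appears roughly 5 seconds after the attempt starts. A minimal sketch for checking that gap from the log timestamps is below; it assumes the glog-style line layout seen above (severity and date, then time of day, then thread id), and the helper names {{glog_time}} and {{elapsed_seconds}} are hypothetical, not part of Mesos.

```python
from datetime import datetime


def glog_time(line: str) -> datetime:
    """Parse the time-of-day field from a glog-style line.

    Assumes the layout 'I1104 19:37:17.741449 8329 file.cpp:NN] message',
    where the second whitespace-separated field is HH:MM:SS.ffffff.
    The date prefix carries no year, so only the time of day is used.
    """
    stamp = line.split()[1]
    return datetime.strptime(stamp, "%H:%M:%S.%f")


def elapsed_seconds(first: str, second: str) -> float:
    """Seconds between two log lines written on the same day."""
    return (glog_time(second) - glog_time(first)).total_seconds()


# Master side: start of authentication vs. the timeout warning.
master_start = "I1104 19:37:17.741449 8329 master.cpp:3874] Authenticating"
master_timeout = "W1104 19:37:22.742959 8329 master.cpp:3953] Authentication timed out"

# Scheduler side: mechanism attempt vs. the timeout warning.
sched_start = "I1104 19:38:57.890656 49007 authenticatee.hpp:250] Attempting to authenticate"
sched_timeout = "W1104 19:39:02.891196 49005 sched.cpp:378] Authentication timed out"

print(f"master gap:    {elapsed_seconds(master_start, master_timeout):.6f} s")
print(f"scheduler gap: {elapsed_seconds(sched_start, sched_timeout):.6f} s")
```

Both gaps come out at just over 5 seconds, which suggests that each side gives up on its own timer rather than reacting to an error reported by the other side.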
It looks like 2 instances of the same framework,
{{scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94}} and
{{scheduler-d2d4437b-d375-4467-a583-362152fe065a}}, are trying to
authenticate, and both fail:
{code}
W1104 19:36:30.769420 8319 master.cpp:3930] Failed to authenticate
scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94@SCHEDULER_IP:8083: Failed to
communicate with authenticatee
I1104 19:36:42.701441 8328 master.cpp:3860] Queuing up authentication request
from scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083 because
authentication is still in progress
{code}
Restarting the master and the scheduler didn't fix it.
This particular issue happens with 1 master and 1 scheduler, even with the
fix for MESOS-1866 applied.