[
https://issues.apache.org/jira/browse/MESOS-7406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
dean chen updated MESOS-7406:
-----------------------------
Description:
I set up a Mesos cluster with ZooKeeper, 3 master nodes, and 12 agent nodes.
Three of the nodes run both master and agent processes. When I run the C++
test framework, it usually fails, but sometimes all four tasks finish
successfully.
On one agent node, I see the following in the stderr log:
{quote}
I0420 22:49:05.532886 17577 exec.cpp:162] Version: 1.2.0
I0420 22:49:05.556433 17582 exec.cpp:237] Executor registered on agent 3f442a52-2e2f-4799-8c1a-3a05b27120b9-S1
I0420 22:49:05.811959 17579 exec.cpp:415] Executor asked to shutdown
{quote}
There are also many connection-closed errors in the log on the master node:
E0420 22:48:51.964684 22212 process.cpp:2426] Failed to shutdown socket with fd 30: Transport endpoint is not connected
E0420 22:48:54.974160 22212 process.cpp:2426] Failed to shutdown socket with fd 30: Transport endpoint is not connected
E0420 22:48:56.192914 22212 process.cpp:2426] Failed to shutdown socket with fd 27: Transport endpoint is not connected
E0420 22:48:57.999858 22212 process.cpp:2426] Failed to shutdown socket with fd 30: Transport endpoint is not connected
E0420 22:49:00.994969 22212 process.cpp:2426] Failed to shutdown socket with fd 30: Transport endpoint is not connected
E0420 22:49:03.994499 22212 process.cpp:2426] Failed to shutdown socket with fd 30: Transport endpoint is not connected
E0420 22:49:05.999225 22212 process.cpp:2426] Failed to shutdown socket with fd 30: Transport endpoint is not connected
E0420 22:49:11.194205 22212 process.cpp:2426] Failed to shutdown socket with fd 27: Transport endpoint is not connected
E0420 22:49:26.196691 22212 process.cpp:2426] Failed to shutdown socket with fd 27: Transport endpoint is not connected
E0420 22:49:41.198381 22212 process.cpp:2426] Failed to shutdown socket with fd 27: Transport endpoint is not connected
It seems that the master tries to shut down a connection between itself and one
agent, but the connection has already been closed. This is only my guess.
How can I fix this, or find the real cause of the problem?
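For reference, the error text in the master log matches what Linux reports when shutdown() is called on a TCP socket that is not connected (errno ENOTCONN, which glibc renders as "Transport endpoint is not connected"). The following is a minimal sketch of that behaviour only. It is not Mesos or libprocess code, the function name shutdown_unconnected_socket is hypothetical, and it does not by itself confirm the guess above about which connection is affected.

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical helper (not part of Mesos): creates a TCP socket, never
// connects it, and calls shutdown() on it.
// Returns the errno set by shutdown(), 0 if shutdown() unexpectedly
// succeeds, or -1 if the socket could not be created.
int shutdown_unconnected_socket() {
  int fd = ::socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) {
    return -1;
  }

  int err = 0;
  if (::shutdown(fd, SHUT_RDWR) != 0) {
    err = errno;  // Expected on Linux: ENOTCONN.
  }

  ::close(fd);
  return err;
}
```

Passing the returned value to std::strerror() shows the message the platform uses for it, which can be compared with the text in the master log.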
> C++ test framework sometimes fails but sometimes succeeds
> ---------------------------------------------------------
>
> Key: MESOS-7406
> URL: https://issues.apache.org/jira/browse/MESOS-7406
> Project: Mesos
> Issue Type: Bug
> Components: framework
> Affects Versions: 1.2.0
> Environment: CentOS 7.2
> 12 VM machines, 8GB mem, 4 CPU cores
> Reporter: dean chen
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)