Dominic,

This might be related to this: https://issues.apache.org/jira/browse/MESOS-7130
- Jie

On Sun, Mar 19, 2017 at 10:10 AM, Dominic Grégoire <dominic.grego...@gmail.com> wrote:

> Hello all,
>
> I’m testing Mesos 1.1.0 on AWS Linux to see if it applies to some of
> our processes, and I ran into a problem with network/port_mapping. Maybe
> this is a known issue?
>
> The agent is running with these flags:
> export MESOS_isolation=cgroups/cpu,cgroups/mem,network/port_mapping
> export MESOS_containerizers=mesos
> export MESOS_resources="ports:[31000-32000];ephemeral_ports:[32768-57344]"
> export MESOS_ephemeral_ports_per_container=1024
>
> I'm running Spark 2.1.0 with 2 Mesos containers on the same host. They can
> connect to each other’s block manager but can’t send traffic; the data
> stays in their netns Send-Q.
>
> Spark is logging:
> 17/03/19 16:54:56 INFO TransportClientFactory: Successfully created
> connection to ip-10-32-20-34.ec2.internal/10.32.20.34:34294 after 12 ms
> (0 ms spent in bootstraps)
> 17/03/19 16:56:56 ERROR TransportChannelHandler: Connection to
> ip-10-32-20-34.ec2.internal/10.32.20.34:34294 has been quiet for 120000
> ms while there are outstanding requests. Assuming connection is dead;
> please adjust spark.network.timeout if this is wrong.
>
> I can see connections established between the containers, but everything
> stays in the send queues:
> [root@ip-10-32-20-34 sysctl.d]# ip netns
> 4602 (id: 1)
> 4600 (id: 0)
> [root@ip-10-32-20-34 sysctl.d]# ip netns exec 4600 netstat -an
> Active Internet connections (servers and established)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State
> tcp        0      0 10.32.20.34:32861       0.0.0.0:*               LISTEN
> tcp        0      0 0.0.0.0:33003           0.0.0.0:*               LISTEN
> tcp        0      0 10.32.20.34:33003       10.32.20.34:57363       ESTABLISHED
> tcp        0      0 10.32.20.34:33566       10.32.20.34:34294       ESTABLISHED
> tcp        0      0 10.32.20.34:33658       10.32.18.185:40600      ESTABLISHED
> tcp        0      0 10.32.20.34:32832       10.32.18.185:40196      ESTABLISHED
> tcp        0      0 10.32.20.34:33406       10.32.20.34:5051        ESTABLISHED
> Active UNIX domain sockets (servers and established)
> Proto RefCnt Flags       Type       State         I-Node   Path
> unix  2      [ ]         STREAM     CONNECTED     21869
> unix  2      [ ]         STREAM     CONNECTED     20339
> [root@ip-10-32-20-34 sysctl.d]# ip netns exec 4602 netstat -an
> Active Internet connections (servers and established)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State
> tcp        0      0 0.0.0.0:33836           0.0.0.0:*               LISTEN
> tcp        0      0 10.32.20.34:34294       0.0.0.0:*               LISTEN
> tcp        0  24229 10.32.20.34:34294       10.32.20.34:33566       ESTABLISHED
> tcp        0      0 10.32.20.34:33860       10.32.18.185:40196      ESTABLISHED
> tcp        0      0 10.32.20.34:34680       10.32.18.185:40600      ESTABLISHED
> tcp        0      0 10.32.20.34:34434       10.32.20.34:5051        ESTABLISHED
> tcp        0      0 10.32.20.34:33836       10.32.20.34:58149       ESTABLISHED
> Active UNIX domain sockets (servers and established)
> Proto RefCnt Flags       Type       State         I-Node   Path
> unix  2      [ ]         STREAM     CONNECTED     20359
> unix  2      [ ]         STREAM     CONNECTED     20373
> [root@ip-10-32-20-34 sysctl.d]#
>
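To make the stalled connection easier to spot in output like the above, the netstat rows can be filtered for connected sockets with a non-empty Send-Q. What follows is only a minimal Python sketch, not anything Mesos or Spark provides: the names `TcpSocket`, `parse_netstat`, `backed_up`, and `SAMPLE` are made up for illustration, and it assumes one socket per line, as `netstat -an` normally prints. It keys on column positions rather than header text, so it should not depend on the locale.

```python
from typing import List, NamedTuple


class TcpSocket(NamedTuple):
    recv_q: int
    send_q: int
    local: str
    foreign: str


def parse_netstat(text: str) -> List[TcpSocket]:
    """Parse the TCP rows of `netstat -an` output, ignoring everything else."""
    sockets = []
    for line in text.splitlines():
        fields = line.split()
        # TCP rows look like: proto recv-q send-q local foreign [state]
        if len(fields) < 5 or not fields[0].startswith("tcp"):
            continue
        if not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        sockets.append(
            TcpSocket(int(fields[1]), int(fields[2]), fields[3], fields[4])
        )
    return sockets


def backed_up(sockets: List[TcpSocket], min_send_q: int = 1) -> List[TcpSocket]:
    """Return connected sockets holding at least `min_send_q` bytes in Send-Q.

    Listening sockets (wildcard foreign address ending in ":*") are skipped,
    because their queue columns do not count unacknowledged bytes.
    """
    return [
        s for s in sockets
        if s.send_q >= min_send_q and not s.foreign.endswith(":*")
    ]


# Three rows copied from the netns 4602 listing quoted above.
SAMPLE = """\
tcp 0 0 10.32.20.34:34294 0.0.0.0:* LISTEN
tcp 0 24229 10.32.20.34:34294 10.32.20.34:33566 ESTABLISHED
tcp 0 0 10.32.20.34:33860 10.32.18.185:40196 ESTABLISHED
"""

if __name__ == "__main__":
    for s in backed_up(parse_netstat(SAMPLE)):
        print(f"{s.local} -> {s.foreign}: {s.send_q} bytes waiting in Send-Q")
```

Against the quoted listing, the only row this flags is the block manager connection from 10.32.20.34:34294 to 10.32.20.34:33566, which matches the connection Spark reports as quiet. On a live agent the same functions could be fed the text of `ip netns exec <ns> netstat -an` for each namespace.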