[ 
https://issues.apache.org/jira/browse/MESOS-9868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879979#comment-16879979
 ] 

Qian Zhang commented on MESOS-9868:
-----------------------------------

172.31.10.35 is actually the IP of the Mesos agent host, so I think we may have 
hit [this 
code|https://github.com/apache/mesos/blob/1.8.0/src/slave/slave.cpp#L5783:L5803],
 i.e., the CNI isolator somehow did not report the container's IP, so the agent 
filled in the container IP with its own IP. We need to investigate why the CNI 
isolator did not report the container's IP.
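
To make the suspected fallback concrete, here is a minimal Python sketch of the behavior described above. It is a simplified model, not the actual C++ code in slave.cpp; the function name `fill_default_network_info` and the dict layout are assumptions chosen to mirror the JSON in the issue description.

```python
import copy


def fill_default_network_info(container_status, agent_ip):
    """Model of the suspected fallback (hypothetical helper, not Mesos code).

    If the container status carries no network info (for example because
    the isolator reported nothing), the agent's own IP is filled in.
    Otherwise the reported network info is left untouched.
    """
    status = copy.deepcopy(container_status)
    if not status.get("network_infos"):
        status["network_infos"] = [
            {"ip_addresses": [{"ip_address": agent_ip}]}
        ]
    return status


AGENT_IP = "172.31.10.35"  # agent host IP mentioned in the comment

# Case 1: nothing reported for the container, so the agent IP is used.
unreported = {"container_id": {"value": "9a2633be-d2e5-4636-9ad4-7b2fc669da99"}}
print(fill_default_network_info(unreported, AGENT_IP)["network_infos"])

# Case 2: an IP was reported, so it is preserved as-is.
reported = {
    "network_infos": [
        {"name": "dcos", "ip_addresses": [{"ip_address": "9.0.35.71"}]}
    ]
}
print(fill_default_network_info(reported, AGENT_IP)["network_infos"])
```

If this model is right, an agent host IP in /state means the status update reached the agent with empty network info, which is what needs to be explained.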

> NetworkInfo from the agent /state endpoint is not correct.
> ----------------------------------------------------------
>
>                 Key: MESOS-9868
>                 URL: https://issues.apache.org/jira/browse/MESOS-9868
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.8.0
>            Reporter: Gilbert Song
>            Priority: Blocker
>              Labels: containerization
>
> NetworkInfo from the agent /state endpoint is not correct, and it also 
> differs from the NetworkInfo of the /containers endpoint. Some frameworks rely 
> on the /state endpoint to get the IP address for other containers to run.
> Agent's /state endpoint:
> {noformat}
> {
> "state": "TASK_RUNNING",
> "timestamp": 1561574343.1521769,
> "container_status": {
> "container_id": {
> "value": "9a2633be-d2e5-4636-9ad4-7b2fc669da99",
> "parent": {
> "value": "45ebab16-9b4b-416e-a7f2-4833fd4ed8ff"
> }
> },
> "network_infos": [
> {
> "ip_addresses": [
> {
> "protocol": "IPv4",
> "ip_address": "172.31.10.35"
> }
> ]
> }
> ]
> },
> "healthy": true
> }
> {noformat}
> Agent's /containers endpoint:
> {noformat}
> "status": {
> "container_id": {
> "value": "5ffc9df2-3be6-4879-8b2d-2fde3f0477e0"
> },
> "executor_pid": 16063,
> "network_infos": [
> {
> "ip_addresses": [
> {
> "ip_address": "9.0.35.71",
> "protocol": "IPv4"
> }
> ],
> "name": "dcos"
> }
> ]
> }
> {noformat}
> The IP addresses are different.
> The container is in RUNNING state and is running correctly; only the /state 
> endpoint is incorrect. Note that the /state endpoint used to show the correct 
> IP. After an agent restart and a master leader re-election, the IP address in 
> the /state endpoint changed.
> Here is the checkpointed CNI network information:
> {noformat}
> OK-23:37:48-root@int-mountvolumeagent2-soak113s:/var/lib/mesos/slave/meta/slaves/60c42ab7-eb1a-4cec-b03d-ea06bff00c3f-S4/frameworks/26ffb84c-81ba-4b3b-989b-9c6560e51fa1-0171/executors/k8s-clusters.kc02__etcd__b50dc403-30d1-4b54-a367-332fb3621030/runs/latest/tasks/k8s-clusters.kc02__etcd-2-peer__5b6aa5fc-e113-4021-9db8-b63e0c8d1f6c
>  # cat 
> /var/run/mesos/isolators/network/cni/45ebab16-9b4b-416e-a7f2-4833fd4ed8ff/dcos/network.conf
>  
> {"args":{"org.apache.mesos":{"network_info":{"name":"dcos"}}},"chain":"M-DCOS","delegate":{"bridge":"m-dcos","hairpinMode":true,"ipMasq":false,"ipam":{"dataDir":"/var/run/dcos/cni/networks","routes":[{"dst":"0.0.0.0/0"}],"subnet":"9.0.73.0/25","type":"host-local"},"isGateway":true,"mtu":1420,"type":"bridge"},"excludeDevices":["m-dcos"],"name":"dcos","type":"mesos-cni-port-mapper"}
> {noformat}
> {noformat}
> OK-01:30:05-root@int-mountvolumeagent2-soak113s:/var/lib/mesos/slave/meta/slaves/60c42ab7-eb1a-4cec-b03d-ea06bff00c3f-S4/frameworks/26ffb84c-81ba-4b3b-989b-9c6560e51fa1-0171/executors/k8s-clusters.kc02__etcd__b50dc403-30d1-4b54-a367-332fb3621030/runs/latest/tasks/k8s-clusters.kc02__etcd-2-peer__5b6aa5fc-e113-4021-9db8-b63e0c8d1f6c
>  # cat 
> /var/run/mesos/isolators/network/cni/45ebab16-9b4b-416e-a7f2-4833fd4ed8ff/dcos/eth0/network.info
> {"dns":{},"ip4":{"gateway":"9.0.73.1","ip":"9.0.73.65/25","routes":[{"dst":"0.0.0.0/0","gw":"9.0.73.1"}]}}
> {noformat}
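
As a quick sanity check on the checkpointed data quoted above, the following Python sketch parses the network.info content and tests whether an IP belongs to the subnet configured in network.conf. The helper names are hypothetical, and `NETWORK_CONF` is trimmed to the fields the sketch reads; the values themselves are copied from the issue description.

```python
import ipaddress
import json

# Trimmed copy of network.conf (only the fields used below).
NETWORK_CONF = '{"delegate": {"ipam": {"subnet": "9.0.73.0/25"}}, "name": "dcos"}'

# network.info as quoted in the issue description.
NETWORK_INFO = (
    '{"dns":{},"ip4":{"gateway":"9.0.73.1","ip":"9.0.73.65/25",'
    '"routes":[{"dst":"0.0.0.0/0","gw":"9.0.73.1"}]}}'
)


def checkpointed_ip(network_info_json):
    """Return the container IP recorded in a network.info document."""
    info = json.loads(network_info_json)
    return ipaddress.ip_interface(info["ip4"]["ip"]).ip


def in_configured_subnet(ip, network_conf_json):
    """Return True if ip lies inside the IPAM subnet of a network.conf."""
    conf = json.loads(network_conf_json)
    subnet = ipaddress.ip_network(conf["delegate"]["ipam"]["subnet"])
    return ipaddress.ip_address(ip) in subnet


if __name__ == "__main__":
    print(checkpointed_ip(NETWORK_INFO))
    # IP shown by the /state endpoint in the issue description.
    print(in_configured_subnet("172.31.10.35", NETWORK_CONF))
    # IP recorded in the checkpointed network.info.
    print(in_configured_subnet(str(checkpointed_ip(NETWORK_INFO)), NETWORK_CONF))
```

A check like this only shows whether a reported address could have come from the CNI network; it does not explain why the /state endpoint lost the checkpointed value.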



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
