[ https://issues.apache.org/jira/browse/MESOS-10243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866134#comment-17866134 ]

Benjamin Mahler commented on MESOS-10243:
-----------------------------------------

Landed fix for host network namespace veth<pid> interface. Let's leave this open and mark as fixed once we also set the container network namespace eth0 interface's mac address on creation / update the script to stop setting it.

> MAC Address changes from link::setMAC may not stick, leading to container
> launch failure with port mapping isolator.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-10243
>                 URL: https://issues.apache.org/jira/browse/MESOS-10243
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.11.0
>            Reporter: Jason Zhou
>            Assignee: Jason Zhou
>            Priority: Major
>
> It seems that there are scenarios where mesos containers cannot communicate
> with agents as the MAC addresses are set incorrectly, leading to dropped
> packets. A workaround for this behavior is to check that the MAC address is
> set correctly after the ioctl call, and retry the address setting if
> necessary.
> In our test, this workaround appears to reduce the frequency of this issue,
> but does not seem to prevent all such failures.
> Reviewboard ticket for the workaround: [https://reviews.apache.org/r/75057/]
> Observed scenarios with incorrectly assigned MAC addresses:
> 1. ioctl returns the correct MAC address, but not net::mac
> 2. both net::mac and ioctl return the same MAC address, but are both wrong
> 3. There are no cases where ioctl/net::mac come back with the same MAC
>    address as before setting. i.e. there is no no-op observed.
> 4. There is a possibility that ioctl/net::mac results disagree with each
>    other even before attempting to set our desired MAC address. As such, we
>    check that the results agree before we set, and log a warning if we find
>    a mismatch
> 5. There is a possibility that the MAC address we set ends up overwritten by
>    a garbage value after setMAC has already completed and checked that the
>    mac address was set correctly. Since this error happens after this
>    function has finished, we cannot log nor detect it in setMAC. Our
>    workaround cannot deal with this scenario as it occurs outside setMAC
> Notes:
> 1. We have observed this behavior only on CentOS 9 systems at the moment.
>    We have tried kernels 5.15.147, 5.15.160, 5.15.161, which all have this
>    issue.
>    CentOS 7 systems do not seem to have this issue with setMAC.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
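
For readers following along, the "set, read back, retry" workaround described in the issue can be sketched roughly as below. This is only an illustration, not the code from the Reviewboard patch or the actual link::setMAC implementation: the names `Mac` and `setMacWithVerify` are hypothetical, and the `set` / `get` callbacks are assumed to wrap whatever mechanism writes and reads the hardware address (for example the SIOCSIFHWADDR and SIOCGIFHWADDR ioctls).

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <functional>

// A 6-byte hardware address. Hypothetical type for this sketch only.
using Mac = std::array<std::uint8_t, 6>;

// Hypothetical helper (not a Mesos API): try to set `desired`, read the
// address back, and retry until the read-back matches or the attempts
// run out. `set` and `get` are assumed to return false on failure.
bool setMacWithVerify(
    const Mac& desired,
    const std::function<bool(const Mac&)>& set,
    const std::function<bool(Mac&)>& get,
    int maxAttempts)
{
  assert(maxAttempts > 0);

  for (int attempt = 0; attempt < maxAttempts; ++attempt) {
    if (!set(desired)) {
      continue; // The write itself failed; try again.
    }

    Mac observed{};
    if (get(observed) && observed == desired) {
      return true; // The read-back agrees with what we asked for.
    }
    // Otherwise the change did not stick; loop and set it again.
  }

  return false; // Never observed the desired address.
}
```

As with the workaround in the issue, a check like this only covers the window before the function returns. It cannot detect scenario 5, where the address is overwritten after the verification has already succeeded.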