[ 
https://issues.apache.org/jira/browse/MESOS-10243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866134#comment-17866134
 ] 

Benjamin Mahler commented on MESOS-10243:
-----------------------------------------

Landed fix for host network namespace veth<pid> interface.

Let's leave this open and mark as fixed once we also set the container network 
namespace eth0 interface's mac address on creation / update the script to stop 
setting it.

> MAC Address changes from link::setMAC may not stick, leading to container 
> launch failure with port mapping isolator.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-10243
>                 URL: https://issues.apache.org/jira/browse/MESOS-10243
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 1.11.0
>            Reporter: Jason Zhou
>            Assignee: Jason Zhou
>            Priority: Major
>
> It seems that there are scenarios where mesos containers cannot communicate 
> with agents as the MAC addresses are set incorrectly, leading to dropped 
> packets. A workaround for this behavior is to check that the MAC address is 
> set correctly after the ioctl call, and retry the address setting if 
> necessary.
> In our tests, this workaround appears to reduce the frequency of this issue, 
> but does not seem to prevent all such failures.
> Reviewboard ticket for the workaround: [https://reviews.apache.org/r/75057/] 
> Observed scenarios with incorrectly assigned MAC addresses:
> 1. ioctl returns the correct MAC address, but net::mac does not
> 2. both net::mac and ioctl return the same MAC address, but are both wrong
> 3. There are no cases where ioctl/net::mac come back with the same MAC
>    address as before setting, i.e. no no-op was observed.
> 4. There is a possibility that ioctl/net::mac results disagree with each
>    other even before attempting to set our desired MAC address. As such, we
>    check that the results agree before we set, and log a warning if we find
>    a mismatch
> 5. There is a possibility that the MAC address we set ends up overwritten by
>    a garbage value after setMAC has already completed and checked that the
>    MAC address was set correctly. Since this error happens after the
>    function has finished, we can neither log nor detect it in setMAC. Our
>    workaround cannot deal with this scenario as it occurs outside setMAC.
> Notes:
> 1. We have observed this behavior only on CentOS 9 systems at the moment.
>    We have tried kernels 5.15.147, 5.15.160, and 5.15.161, all of which have
>    this issue.
>    CentOS 7 systems do not seem to have this issue with setMAC.
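
For readers following along, the workaround described in the issue (warn if
the two readers disagree before setting, then set, read back, and retry on
mismatch) might look roughly like the sketch below. This is not the code from
the Reviewboard patch. LinkOps, setMacWithVerify and maxAttempts are
hypothetical names, and the setter and the two readers are injected as
callbacks; on Linux they would presumably wrap the SIOCSIFHWADDR and
SIOCGIFHWADDR ioctls and the net::mac lookup.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <functional>
#include <iostream>
#include <optional>

using Mac = std::array<std::uint8_t, 6>;

// Hypothetical hooks, not Mesos APIs. `set` returns false if the
// underlying call fails; each reader returns nullopt if it cannot
// obtain an address.
struct LinkOps {
  std::function<bool(const Mac&)> set;
  std::function<std::optional<Mac>()> readIoctl;
  std::function<std::optional<Mac>()> readAlt;
};

// Returns true once both readers report `desired` after a set attempt,
// false if that never happens within `maxAttempts` attempts.
bool setMacWithVerify(const LinkOps& ops, const Mac& desired,
                      int maxAttempts = 3)
{
  assert(maxAttempts > 0);

  // Scenario 4: the readers may disagree even before we set anything.
  std::optional<Mac> a = ops.readIoctl();
  std::optional<Mac> b = ops.readAlt();
  if (a != b) {
    std::cerr << "warning: MAC readers disagree before set" << std::endl;
  }

  for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
    if (!ops.set(desired)) {
      continue;  // The set call itself failed; try again.
    }

    // Scenarios 1 and 2: read back through both paths and compare.
    a = ops.readIoctl();
    b = ops.readAlt();
    if (a == desired && b == desired) {
      return true;
    }

    std::cerr << "warning: MAC did not stick on attempt " << attempt
              << std::endl;
  }

  return false;
}
```

As the issue notes under scenario 5, a check of this kind cannot catch an
address that is overwritten after the function has returned.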



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
