https://github.com/systemd/systemd-stable/pull/111 has not landed in a v249 point release yet. v249.3 was tagged 12 days ago, and that fix was merged only 8 days ago.
On Wed, Aug 18, 2021 at 8:56 AM Amish <anon.am...@gmail.com> wrote:
> Hello
>
> Further to my previous email:
>
> I see that there is already an *extremely similar issue* reported on July
> 12, 2021 and it has been fixed.
>
> https://github.com/systemd/systemd/issues/20203
>
> But I do not know if this fix exists in systemd v249.3 (Arch Linux)
>
> If it exists that means that fix is breaking my system.
> And if it does not exist that means, I can expect it to fix my issue.
>
> Regards,
>
> Amish.
>
> On 18/08/21 11:42 am, Amish wrote:
>
> Hello,
>
> Thank you for your reply.
>
> I can understand that there can be race.
>
> *But when I check logs, there is no race happening*.
>
> *Let us see and analyze the logs.*
>
> Stage 1:
> System boots, and kernel assigns eth0, eth1 and eth2 as interface names.
>
> Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) e0:d5:5e:8d:7f:2f
> Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
> Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
> Aug 18 09:17:13 kk kernel: 8139too 0000:04:00.0 eth1: RealTek RTL8139 at 0x000000000e8fc9bb, 00:e0:4d:05:ee:a2, IRQ 19
> Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 eth2: RTL8168e/8111e, 50:3e:aa:05:2b:ca, XID 2c2, IRQ 129
> Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 eth2: jumbo features [frames: 9194 bytes, tx checksumming: ko]
>
> Stage 2:
> Now udev rules are triggered and the interfaces are renamed to tmpeth0,
> tmpeth2 and tmpeth1.
>
> Aug 18 09:17:13 kk kernel: 8139too 0000:04:00.0 tmpeth2: renamed from eth1
> Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 tmpeth0: renamed from eth0
> Aug 18 09:17:13 kk kernel: r8169 0000:02:00.0 tmpeth1: renamed from eth2
>
> Stage 3:
> Now my script is called and it renames interfaces to eth0, eth2 and eth1.
>
> Aug 18 09:17:13 kk kernel: e1000e 0000:00:1f.6 eth0: renamed from tmpeth0
> Aug 18 09:17:14 kk kernel: r8169 0000:02:00.0 eth1: renamed from tmpeth1
> Aug 18 09:17:14 kk kernel: 8139too 0000:04:00.0 eth2: renamed from tmpeth2
>
> Effectively original interface eth1 and eth2 are swapped. While eth0
> remains eth0.
>
> All these happened before systemd-networkd started and interface renaming
> was over by 9:17:14.
>
> Stage 4:
> Now systemd-networkd starts, 2 seconds after all interface have been
> assigned their final names.
>
> Aug 18 09:17:16 kk systemd[1]: Starting Network Configuration...
> Aug 18 09:17:17 kk systemd-networkd[426]: lo: Link UP
> Aug 18 09:17:17 kk systemd-networkd[426]: lo: Gained carrier
> Aug 18 09:17:17 kk systemd-networkd[426]: Enumeration completed
> Aug 18 09:17:17 kk systemd[1]: Started Network Configuration.
> Aug 18 09:17:17 kk systemd-networkd[426]: eth2: Interface name change detected, renamed to eth1.
> Aug 18 09:17:17 kk systemd-networkd[426]: Could not process link message: File exists
> Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Failed
> Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Interface name change detected, renamed to eth2.
> Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Interface name change detected, renamed to tmpeth2.
> Aug 18 09:17:17 kk systemd-networkd[426]: eth0: Interface name change detected, renamed to tmpeth0.
> Aug 18 09:17:17 kk systemd-networkd[426]: eth2: Interface name change detected, renamed to tmpeth1.
> Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth0: Interface name change detected, renamed to eth0.
> Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth1: Interface name change detected, renamed to eth1.
> Aug 18 09:17:17 kk systemd-networkd[426]: tmpeth2: Interface name change detected, renamed to eth2.
> Aug 18 09:17:17 kk systemd-networkd[426]: eth1: Link UP
> Aug 18 09:17:17 kk systemd-networkd[426]: eth0: Link UP
> Aug 18 09:17:20 kk systemd-networkd[426]: eth0: Gained carrier
>
> This is when eth0 and eth1 interfaces are up and configured by
> systemd-networkd but eth2 is down and not configured.
>
> *None of the .network configuration files match by interface names. They
> all match just by MAC address.*
>
> # sample .network file.
>
> [Match]
> MACAddress=e0:d5:5e:8d:7f:2f
> Type=ether
>
> [Network]
> IgnoreCarrierLoss=yes
> LinkLocalAddressing=no
> IPv6AcceptRA=no
> ConfigureWithoutCarrier=true
> Address=192.168.25.2/24
>
> Above error message "eth1: failed", was not showing earlier version of
> systemd.
>
> So recent version of systemd-networkd is doing something different and
> this is where something is going wrong.
>
> Stage 5: (my workaround for this issue)
> I wrote a new service file which restarts systemd-networkd after waiting
> for 10 seconds.
>
> Aug 18 09:17:27 kk systemd[1]: Stopping Network Configuration...
> Aug 18 09:17:27 kk systemd[1]: systemd-networkd.service: Deactivated successfully.
> Aug 18 09:17:27 kk systemd[1]: Stopped Network Configuration.
> Aug 18 09:17:27 kk systemd[1]: Starting Network Configuration...
> Aug 18 09:17:27 kk systemd-networkd[579]: eth1: Link UP
> Aug 18 09:17:27 kk systemd-networkd[579]: eth0: Link UP
> Aug 18 09:17:27 kk systemd-networkd[579]: eth0: Gained carrier
> Aug 18 09:17:27 kk systemd-networkd[579]: lo: Link UP
> Aug 18 09:17:27 kk systemd-networkd[579]: lo: Gained carrier
> Aug 18 09:17:27 kk systemd-networkd[579]: Enumeration completed
> Aug 18 09:17:27 kk systemd[1]: Started Network Configuration.
> Aug 18 09:17:27 kk systemd-networkd[579]: eth2: Link UP
> Aug 18 09:17:27 kk systemd-networkd[579]: eth2: Gained carrier
>
> All interfaces are now up and running as expected.
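
The kernel "renamed from" messages quoted in Stages 2 and 3 can be condensed into old/new pairs, which makes the order of renames easier to compare between boots. This is only a rough sketch: rename_events is a hypothetical helper name, and it assumes each message sits on a single line in the format shown in the quoted logs.

```shell
# Read kernel log lines on stdin and print "OLD -> NEW" for every
# "NEW: renamed from OLD" message; all other lines are ignored.
# rename_events is a hypothetical helper name used for illustration.
rename_events() {
    sed -n 's/.* \([^ ]*\): renamed from \([^ ]*\) *$/\2 -> \1/p'
}
```

It could be fed from the kernel log of the current boot, for example with `journalctl -k -b | rename_events`.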
>
> Please check as I do not believe that this issue is causing any race but
> to me it looks like some logical change in systemd-networkd which is
> causing the issue.
>
> Thank you and regards,
>
> Amish
>
> On 17/08/21 3:18 pm, Colin Guthrie wrote:
>
> Hiya,
>
> As has been said, this is racy. "Sufficiently early" is just a hope,
> rather than a guarantee. Perhaps something in the kernel made things more
> or less efficient (try booting with the old kernel to see if it helps, but
> as this is a race, it may only work some of the time). Or perhaps some
> unit ordering changed to make this better? Perhaps udev settle units have
> now been dropped and thus the boot is faster and things happen in a more
> hotplug oriented way? Lots of possibilities for why this no longer works
> (and even before it definitely wasn't a guaranteed or recommended
> approach).
>
> As has been said, you're best to pick a different "namespace" lan0 wan0
> wan1 etc. if you can but if you can't change this due to some legacy
> scripts, at least pick sufficiently high ethN numbers to stay out of the
> way of the kernel, e.g. if you have three eth cards, then pick your names
> starting from e.g. 5: eth5, eth6, eth7 and thus you can avoid this dance
> with temporary names (although I'd still recommend using different names
> altogether if you can).
>
> Hope that helps.
>
> Col
>
> Amish wrote on 16/08/2021 13:38:
>
> On 16/08/21 5:39 pm, Lennart Poettering wrote:
>
> On Mo, 16.08.21 17:31, Amish (anon.am...@gmail.com) wrote:
>
> On 16/08/21 5:25 pm, Lennart Poettering wrote:
>
> On Mo, 16.08.21 16:09, Amish (anon.am...@gmail.com) wrote:
>
> Some old scripts that we have expect interface names starting with eth.
> But those names are not predictable.
>
> So to get predictable names starting with eth*, first I temporarily rename
> all interface with tmpeth*. This is done via udev rules.
>
> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:XX", NAME="tmpeth0"
> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:YY", NAME="tmpeth1"
> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:ZZ", NAME="tmpeth2"
>
> Then I have a small service (script) which runs before network-pre.target
> to convert these names back to eth*
>
> #search for network interface with name starting from "tmpeth" and rename them to "eth"
> /usr/bin/find /sys/class/net -maxdepth 1 -name "tmpeth[0-9]" -type l -printf "%f\n" | while read tmpiface; do /usr/bin/ip link set dev "$tmpiface" name "$(echo $tmpiface | sed s/tmpeth/eth/)"; done
>
> This ensures that I have predictable names starting with eth*. And it has
> been working fine for 2-3 years. Even with current issue, name assignment is
> working fine.
>
> This cannot work and is necessarily racy. Stay out of the ethXYZ
> namespace, that's the kernel's namespace. Pick any other names,
> i.e. "foobar0", "foobar1", but otherwise you just have a racy racy
> mess, because the kernel might take the name whenever it pleases.
>
> No I don't think this is race. Because my script runs after Udev has
> finished assigning the interfaces names.
>
> device probing can take any time it wants. there isn't a point in time
> where everything is probed.
>
> These are internal PCI LAN cards. I believe these gets probed (and named)
> sufficiently early.
>
> And then we can expect names assigned by Udev to remain same.
>
> And I can see in the logs that names are not changed after my script runs.
>
> Also this has been working successfully for me from 2 or more years.
>
> But after today's update, something is breaking all the systems.
>
> Additionally just now on other system I see eth2 (instead of eth1) being
> renamed to eth0.
>
> I just want to know what changed and where? (Kernel or Systemd?).
>
> *Also another point is, I have set ConfigureWithoutCarrier=yes in network
> files and all are static IPs, so systemd-networkd should have configured
> the devices even if links are not up. But it's not doing that anymore either
> after today's update.*
>
> Regards
>
> Amish.
>
> Lennart
>
> --
> Lennart Poettering, Berlin
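
For reference, the tmpeth-to-eth rename loop quoted above can be written without find and sed, using shell parameter expansion instead. This is a minimal sketch assuming a POSIX shell; final_name and plan_renames are hypothetical helper names, and the sketch only prints the ip commands rather than running them. It does not remove the race with the kernel's own ethN naming that Lennart and Colin describe.

```shell
# Map a temporary name such as "tmpeth1" to its final name "eth1".
# final_name is a hypothetical helper name used for illustration.
final_name() {
    printf 'eth%s\n' "${1#tmpeth}"
}

# Print (do not execute) one "ip link set" command per tmpethN entry
# found in the given directory (default: /sys/class/net).
# plan_renames is likewise a hypothetical name.
plan_renames() {
    sysfs_dir=${1:-/sys/class/net}
    for path in "$sysfs_dir"/tmpeth[0-9]; do
        # An unmatched glob stays literal; skip it.
        [ -e "$path" ] || continue
        iface=${path##*/}
        printf 'ip link set dev %s name %s\n' "$iface" "$(final_name "$iface")"
    done
}
```

Running `plan_renames` with no argument lists the commands for the current system, so the plan can be reviewed before anything is renamed.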