Ciao David,
msk is the first port added to the trunk? ie, it's the preferred port? if you run tcpdump on msk or watch systat if, do you see packets on msk?
The network config is pretty standard; an Ethernet port (msk0), a wifi one (iwn0), trunk0 with failover (using msk0 as "preferred" port):
$ cat /etc/hostname.trunk0
trunkproto failover
trunkport msk0
trunkport iwn0
autoconf
up
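For readers less familiar with hostname.if(5): each line of such a file is, roughly speaking, handed to ifconfig(8) as arguments for that interface. The dry-run sketch below only prints the commands that the file above would correspond to and executes nothing. It assumes one directive per line, print_ifconfig_cmds is a hypothetical helper name, and it ignores the special keywords that netstart(8) handles separately.

```shell
#!/bin/sh
# Dry run: print the ifconfig(8) invocation that each line of a
# hostname.if(5)-style file roughly corresponds to. Nothing is executed.
# print_ifconfig_cmds is a hypothetical helper, not part of OpenBSD.
print_ifconfig_cmds() {
	ifname=$1
	while IFS= read -r line; do
		case $line in
		''|'#'*) continue ;;	# skip blank lines and comments
		esac
		printf 'ifconfig %s %s\n' "$ifname" "$line"
	done
}

# Assumed layout of /etc/hostname.trunk0, one directive per line.
print_ifconfig_cmds trunk0 <<'EOF'
trunkproto failover
trunkport msk0
trunkport iwn0
autoconf
up
EOF
```

Because msk0 is added first, it is the port the failover protocol treats as preferred.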
Bear with me, tcpdump is rather unfamiliar territory for me... Enclosed please find the output files from the following commands:
$ doas tcpdump -i trunk0 -c 50 -w trunk0.dump
$ doas tcpdump -i msk0 -c 50 -w msk0.dump
$ doas tcpdump -i iwn0 -c 50 -w iwn0.dump
I see some "broadcast" packets on both trunk0 and msk0 (trunk0 didn't receive the inet address from the DHCP server, of course); nothing on iwn0, as expected.
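The saved dumps can be read back with tcpdump's -r option. This is a minimal sketch, assuming the three files sit in the current directory. show_dhcp is a hypothetical helper name, and the 'port 67 or port 68' filter is an assumption that narrows the output to DHCP traffic:

```shell
#!/bin/sh
# show_dhcp (hypothetical helper): print the DHCP packets stored in a
# dump file, or a short note if the file is not readable.
show_dhcp() {
	if [ ! -r "$1" ]; then
		echo "skip $1: not readable"
		return 0
	fi
	echo "== $1 =="
	# -n: no name resolution, -e: show link-level (MAC) headers,
	# -r: read packets from a file instead of an interface
	tcpdump -n -e -r "$1" 'port 67 or port 68'
}

for ifc in trunk0 msk0 iwn0; do
	show_dhcp "$ifc.dump"
done
```

Comparing the msk0 and trunk0 output side by side should show whether DHCP requests leave the machine and whether any replies come back on the wire.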
Hope this answers your question... Cheers
On 28 Dec 2021, at 20:41, Alessandro De Laurenzis <[email protected]> wrote:

Hello David,

On 28/12/2021 08:39, [email protected] wrote:

Hi Alessandro,

Did you bisect the whole kernel or just msk(4) changes?

I made repeated installations from scratch, starting from 5.3 till 7.0, so I bisected the whole kernel. But then, using the 7.0-stable sources, I reverted if_mskvar.h to 1.13 and if_msk.c to 1.130, applying to the latter all the modifications subsequent to 1.131 (in order to make the code usable, since there have been a few API changes meanwhile), and verified that trunk(4) i/f is still fully functional, so I think the issue is in the current if_msk.c code.

Hope this helps.

All the best

On 8 Dec 2021, at 04:39, Alessandro De Laurenzis <[email protected]> wrote:

Greetings,

I recently installed OpenBSD 7.0 on an old CoreDuo2 machine (Compaq 610, complete dmesg in attach), which was powered by 5.5/5.6/5.7 some years ago, without any relevant issues (after that, it has been used as home server with Debian).

mskc0 at pci4 dev 0 function 0 "Marvell Yukon 88E8042" rev 0x10, Yukon-2 FE+ rev. A0 (0x0): msi
msk0 at mskc0 port A: address 18:a9:05:94:ab:19
eephy0 at msk0 phy 0: 88E3016 10/100 PHY, rev. 0

I noticed that the trunk(4) failover protocol is broken when the Ethernet cable is plugged in (starting in this configuration, no lease is acquired from DHCP server, switching to Ethernet from wifi breaks the connection; in both cases, trunk and msk0 status is: no carrier). It's worth noting that when msk0 is configured as "stand-alone" (i.e., without trunk(4) failover), the connection is pretty functional and stable.
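The "no carrier" state mentioned above is what ifconfig(8) reports on its "status:" line. A small sketch for pulling out just that value follows. link_status is a hypothetical helper name, and the sample input is illustrative only, not captured from the affected machine:

```shell
#!/bin/sh
# link_status (hypothetical helper): read ifconfig(8) output on stdin
# and print only the value of the "status:" line.
link_status() {
	sed -n 's/^[[:space:]]*status:[[:space:]]*//p'
}

# Illustrative input only, not real output from the machine in question.
printf '\tmedia: Ethernet autoselect\n\tstatus: no carrier\n' | link_status
```

On the laptop itself this would be used as `ifconfig msk0 | link_status` and `ifconfig trunk0 | link_status`, once with the cable plugged in and once without.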
Since I didn't remember any similar problems showing up with 5.x, I made a bit of bisecting, and my conclusion is that the functionality got broken b/w 6.2 and 6.3 and, specifically, after the following commit:

RCS file: /cvs/src/sys/dev/pci/if_msk.c,v
----------------------------
revision 1.131
date: 2018/01/06 03:11:04; author: dlg; state: Exp; lines: +251 -311; commitid: BhB8LisF92o4xfOK;
rework the transmit and receive paths to address reliability issues.

phessler@ has been having trouble with msk on overdrive 1000s. some of the issues relate to the driver not coping with exhaustion of mbufs for the rx ring, the other issues are corruption of the mcl9k pool that msk uses.

this diff adds a timeout that the rx refill code uses when the rx ring is empty and cannot be filled. it'll periodically retry the ring refill until it can get some mbufs in the air again.

the current code made hunting for the mcl9k issue too hard, so this rewrites it to be simpler and more like other drivers. there's now just arrays of mbuf pointers and dmamaps to shadow the hardware ring entries, and producer and consumer indexes. what was there before had linkes lists of something to hold mbuf pointers and dmamaps, and some way to go from the ring to go back to that. i think, it was hard to tell what was happening.

this also copies the ADDR64 handling on the tx ring to the rx ring. this potentially makes more rx descriptors available, but that can happen later.

in hindsight the mcl9k problem could have been from letting if_rxr allocate the entier ring. if every descriptor was filled, the chip may have run around the ring when it shouldnt have. giving rxr one less descriptor than there is on the ring may have fixed the problem too.

this work also makes it easier to make msk mpsafe.

tested by an ok phessler@
ok kettenis@ deraadt@
=============================================================================

and the corresponding one for sys/dev/pci/if_mskvar.h (revision 1.14, same log).
On a fresh 6.3 install, which was showing the issue, I reverted the 2 files to the revisions 1.130 and 1.13 respectively, observing a functional trunk(4) failover again.

The diff is too long and complex, so I cannot say where the problem lies exactly, but I hope this report contains enough information to start an analysis (I'm copying the involved developers, just in case they are not reading this list); of course, I'm available to test any patches (on 7.0 or -current) and add further details if needed.

Please note that the dmesg is from OBSD 6.3, since that is the version currently installed on the laptop; in case you're interested in the 7.0/current's dmesg, just let me know.

All the best

-- 
Alessandro De Laurenzis
[mailto:[email protected]]
Web: http://www.atlantide.mooo.com
LinkedIn: http://it.linkedin.com/in/delaurenzis

<dmesg.txt>
-- 
Alessandro De Laurenzis
[mailto:[email protected]]
Web: http://www.atlantide.mooo.com
LinkedIn: http://it.linkedin.com/in/delaurenzis
Attachments (binary data): trunk0.dump, msk0.dump, iwn0.dump
