On Tue, 1 Jul 2025, 14.46 Lifshits, Vitaly, <[email protected]> wrote:
> On 7/1/2025 8:31 AM, En-Wei WU wrote:
> > Hi,
> >
> > I'm seeing a regression on an HP ZBook using the e1000e driver
> > (chipset PCI ID: [8086:57a0]) -- the system can't get an IP address
> > after hot-plugging an Ethernet cable. In this case, the Ethernet cable
> > was unplugged at boot. The network interface eno1 was present but
> > stuck in the DHCP process. Using tcpdump, only TX packets were visible
> > and never got any RX -- indicating a possible packet loss or
> > link-layer issue.
> >
> > This is on the vanilla Linux 6.16-rc4 (commit
> > 62f224733431dbd564c4fe800d4b67a0cf92ed10).
> >
> > Bisect says it's this commit:
> >
> > commit efaaf344bc2917cbfa5997633bc18a05d3aed27f
> > Author: Vitaly Lifshits <[email protected]>
> > Date:   Thu Mar 13 16:05:56 2025 +0200
> >
> >     e1000e: change k1 configuration on MTP and later platforms
> >
> >     Starting from Meteor Lake, the Kumeran interface between the integrated
> >     MAC and the I219 PHY works at a different frequency. This causes sporadic
> >     MDI errors when accessing the PHY, and in rare circumstances could lead
> >     to packet corruption.
> >
> >     To overcome this, introduce minor changes to the Kumeran idle
> >     state (K1) parameters during device initialization. Hardware reset
> >     reverts this configuration, therefore it needs to be applied in a few
> >     places.
> >
> >     Fixes: cc23f4f0b6b9 ("e1000e: Add support for Meteor Lake")
> >     Signed-off-by: Vitaly Lifshits <[email protected]>
> >     Tested-by: Avigail Dahan <[email protected]>
> >     Signed-off-by: Tony Nguyen <[email protected]>
> >
> >  drivers/net/ethernet/intel/e1000e/defines.h |  3 +++
> >  drivers/net/ethernet/intel/e1000e/ich8lan.c | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
> >  drivers/net/ethernet/intel/e1000e/ich8lan.h |  4 ++++
> >  3 files changed, 82 insertions(+), 5 deletions(-)
> >
> > Reverting this patch resolves the issue.
> >
> > Based on the symptoms and the bisect result, this issue might be
> > similar to
> > https://lore.kernel.org/intel-wired-lan/[email protected]/
> >
> > Affected machine is:
> > HP ZBook X G1i 16 inch Mobile Workstation PC, BIOS 01.02.03 05/27/2025
> > (see end of message for dmesg from boot)
> >
> > CPU model name:
> > Intel(R) Core(TM) Ultra 7 265H (Arrow Lake)
> >
> > ethtool output:
> > driver: e1000e
> > version: 6.16.0-061600rc4-generic
> > firmware-version: 0.1-4
> > expansion-rom-version:
> > bus-info: 0000:00:1f.6
> > supports-statistics: yes
> > supports-test: yes
> > supports-eeprom-access: yes
> > supports-register-dump: yes
> > supports-priv-flags: yes
> >
> > lspci output:
> > 0:1f.6 Ethernet controller [0200]: Intel Corporation Device [8086:57a0]
> >     DeviceName: Onboard Ethernet
> >     Subsystem: Hewlett-Packard Company Device [103c:8e1d]
> >     Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> >     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> >     Latency: 0
> >     Interrupt: pin D routed to IRQ 162
> >     IOMMU group: 17
> >     Region 0: Memory at 92280000 (32-bit, non-prefetchable) [size=128K]
> >     Capabilities: [c8] Power Management version 3
> >         Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> >         Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
> >     Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> >         Address: 00000000fee00798 Data: 0000
> >     Kernel driver in use: e1000e
> >     Kernel modules: e1000e
> >
> > The relevant dmesg:
> > <<<cable disconnected>>>
> >
> > [ 0.927394] e1000e: Intel(R) PRO/1000 Network Driver
> > [ 0.927398] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> > [ 0.927933] e1000e 0000:00:1f.6: enabling device (0000 -> 0002)
> > [ 0.928249] e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
> > [ 1.155716] e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered PHC clock
> > [ 1.220694] e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) 24:fb:e3:bf:28:c6
> > [ 1.220721] e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
> > [ 1.220903] e1000e 0000:00:1f.6 eth0: MAC: 16, PHY: 12, PBA No: FFFFFF-0FF
> > [ 1.222632] e1000e 0000:00:1f.6 eno1: renamed from eth0
> >
> > <<<cable connected>>>
> >
> > [ 153.932626] e1000e 0000:00:1f.6 eno1: NIC Link is Up 1000 Mbps Half Duplex, Flow Control: None
> > [ 153.934527] e1000e 0000:00:1f.6 eno1: NIC Link is Down
> > [ 157.622238] e1000e 0000:00:1f.6 eno1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
> >
> > No error message seen after hot-plugging the Ethernet cable.
> >
>
> Thank you for the report.
>
> We did not encounter this issue during our patch testing. However, we
> will attempt to reproduce it in our lab.
>
> One detail that caught my attention is that flow control is disabled in
> both scenarios. Could you please check whether the issue persists when
> flow control is enabled? This might require connecting to a link partner
> that supports flow control.
>

I wrote the other, similar report (on a Dell Pro) referenced earlier.

Additional testing on the Dell provided the following insight:

- A fast cable out/in will work. The cable has to stay disconnected for
  10-15 seconds for the issue to trigger.
- Using ethtool -r to renegotiate the link will make things work again
  once the interface is in the defunct state.

And yes, my issue seems to be exactly the same.

Thanks,
Timo
