On 6/18/2025 4:41 PM, Christian Heusel wrote:
On 25/06/18 03:28PM, Marek Marczykowski-Górecki wrote:
On Fri, May 09, 2025 at 02:17:32AM +0200, Marek Marczykowski-Górecki wrote:
On Fri, May 09, 2025 at 01:28:36AM +0200, Marek Marczykowski-Górecki wrote:
On Fri, May 09, 2025 at 01:13:28AM +0200, Paul Menzel wrote:
Dear Marek, dear Vitaly,
Am 09.05.25 um 00:41 schrieb Marek Marczykowski-Górecki:
On Thu, May 08, 2025 at 09:26:18AM +0300, Lifshits, Vitaly wrote:
On 4/21/2025 4:28 PM, Marek Marczykowski-Górecki wrote:
On Mon, Apr 21, 2025 at 03:19:12PM +0200, Marek Marczykowski-Górecki wrote:
On Mon, Apr 21, 2025 at 03:44:02PM +0300, Lifshits, Vitaly wrote:
On 4/16/2025 3:43 PM, Marek Marczykowski-Górecki wrote:
On Wed, Apr 16, 2025 at 03:09:39PM +0300, Lifshits, Vitaly wrote:
Can you please also share the output of ethtool -i? I would like to know the
NVM version that you have on your device.
driver: e1000e
version: 6.14.1+
firmware-version: 1.1-4
expansion-rom-version:
bus-info: 0000:00:1f.6
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
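When comparing several machines, the fields above can be pulled out of the `ethtool -i` output programmatically. The following is only a minimal sketch: the helper name `parse_ethtool_info` is hypothetical, and the sample text is an excerpt of the output quoted above.

```python
def parse_ethtool_info(text: str) -> dict[str, str]:
    """Parse `ethtool -i <iface>` style output into a key -> value dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as the PCI
        # address "0000:00:1f.6" stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Excerpt of the output quoted in this thread.
sample = """\
driver: e1000e
version: 6.14.1+
firmware-version: 1.1-4
expansion-rom-version:
bus-info: 0000:00:1f.6
"""

info = parse_ethtool_info(sample)
print(info["driver"], info["firmware-version"], info["bus-info"])
```

Fields that are present but empty (such as expansion-rom-version here) come back as empty strings.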
Your firmware version is not the latest, can you check with the board
manufacturer if there is a BIOS update to your system?
I can check, but still, it's a regression in the Linux driver - the old
kernel worked perfectly well on this hw. Maybe the new driver tries to use
some feature that is missing (or broken) in the old firmware?
A little bit of context: I'm maintaining the kernel package for the Qubes
OS distribution. While I can try to update firmware on my test system, I
have no influence over which hardware users will run this kernel on, or
which firmware version they will have (or whether all the vendors
provide newer firmware at all). I cannot ship a kernel that is known
to break network on some devices.
Also, you mentioned that on another system this issue doesn't reproduce, do
they have the same firmware version?
The other one also has 1.1-4 firmware. And I re-checked, e1000e from
6.14.2 works fine there.
Thank you for your detailed feedback and for providing the requested
information.
We have conducted extensive testing of this patch across multiple systems
and have not observed any packet loss issues. Upon comparing the mentioned
setups, we noted that while the LAN controller is similar, the CPU differs.
We believe that the issue may be related to transitions in the CPU's low
power states.
Consequently, we kindly request that you disable the CPU low power state
transitions in the S0 system state and verify if the issue persists. You can
disable this in the kernel parameters on the command line with idle=poll.
Please note that this command is intended for debugging purposes only, as it
may result in higher power consumption.
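To confirm that the option actually took effect after a reboot, the running kernel's command line can be inspected. This is a rough sketch only: the helper names `has_kernel_param` and `read_cmdline` and the sample command line are hypothetical, and it assumes a Linux system that exposes /proc/cmdline.

```python
def has_kernel_param(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole word on the command line."""
    return param in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read()


# Hypothetical command line, for illustration only.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro idle=poll quiet"
print(has_kernel_param(example, "idle=poll"))
```

On the machine under test, `has_kernel_param(read_cmdline(), "idle=poll")` would perform the same check against the live system.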
I tried with idle=poll, and it didn't help, I still see a lot of packet
losses. But I can also confirm that idle=poll makes the system use
significantly more power (previously at 25-30W, with this option it stays
at about 42W).
Is there any other info I can provide, enable some debug features or
something?
I see the problem is with receiving packets - in my simple ping test,
the ping target sees all the echo requests (and responds to them), but
the responses never reach ping (and are not visible in tcpdump
on the problematic system either).
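To put a number on the loss seen in such a ping test, the summary line that ping prints can be parsed. The sketch below assumes the summary format used by iputils ping ("N packets transmitted, M received, ..."); the helper name `packet_loss` and the sample figures are made up for illustration and are not measurements from this thread.

```python
import re

# Matches the iputils-style summary line; other ping implementations
# may word this line differently.
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def packet_loss(ping_output: str) -> float:
    """Return the fraction of echo requests that got no reply (0.0 to 1.0)."""
    match = SUMMARY_RE.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return (sent - received) / sent


# Illustrative summary line, not real data.
sample_summary = "10 packets transmitted, 4 received, 60% packet loss, time 9012ms"
print(f"{packet_loss(sample_summary):.0%} loss")
```

Running the same test on a kernel with and without the patch gives two figures that can be compared directly.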
As the cause is still unclear, can the commit please be reverted in the
master branch to adhere to Linux's no-regression policy, so that it can be
reverted from the stable series?
Marek, did you also test 6.15 release candidates?
The last test I did was on 6.15-rc3. I can re-test on -rc5.
Same with 6.15-rc5.
And the same issue still applies to 6.16-rc2. FWIW the Qubes OS kernel has
this buggy patch reverted and nobody complained (contrary to the version
with the patch included). Should I submit the revert patch?
It is not a good idea to revert this patch as most of the systems will
encounter the original issues (PHY access and packet loss). The reason I
first introduced this patch was because big vendors reported the packet
loss issue. You can refer to the following sightings:
https://answers.launchpad.net/ubuntu/+question/816003
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2066064
https://bugzilla.kernel.org/show_bug.cgi?id=218869
As an intermediate solution we can use a private flag to make
it configurable. I will share with you a patch that might fix the issue
on your system that I would like you to try.
FYI, we are currently investigating a similar issue that seems to be due
to a misconfiguration of the system firmware.
Just submit a revert then 👍 I have no authority here, but had good
experience with just sending a revert patch in the past 🤗
Cheers,
Chris