Bug#1023563: linux-image-5.10.0-19-amd64: Ephemeral ports are reused too quickly, even when net.ipv4.tcp_tw_reuse = 0

2022-11-06 Thread Markus Wernig
Package: linux-image-5.10.0-19-amd64
Version: 5.10.149-2
Severity: important

Dear Maintainer,

Starting with linux-image-5.10.0-15-amd64 (5.10.120-1), the kernel
appears to reuse ephemeral TCP ports too quickly, even when
net.ipv4.tcp_tw_reuse is set to 0.

linux-image-5.10.0-14-amd64 (5.10.113-1) and all earlier versions did
not show that behaviour.

The behaviour is the same for IPv4 and IPv6.

* What led up to the situation?

I have a couple of moderately busy web servers that open TCP
sessions (~15-20 new connections per second) to a dedicated port on a
backend server. The connections are short-lived and are terminated by
the backend server after about 1 second on average.
This setup has been working for many years through many Debian releases
and kernel versions.

On July 2, 2022 I updated the systems via apt, which upgraded the
Linux kernel image from 5.10.0-14 to 5.10.0-15.

Shortly afterwards I noticed an increasing number of connection errors
being reported by the web servers (timeouts).

Further analysis (mostly with tcpdump) showed that the web servers
had started reusing ephemeral TCP ports as little as 30 seconds after
their last use. At that point the backend server (which is also Debian)
still had the corresponding sockets in the TIME_WAIT state and replied
to the new SYN packet with an ACK instead of a SYN-ACK (which is of
course normal behaviour, since the socket was still open). The web
server did not expect the ACK and discarded it, occasionally resending
the SYN, until a timeout occurred.

The choice of ephemeral source ports appeared quite erratic. For some
seconds they were chosen in ascending order as expected, then they
seemed to jump back to some lower position, proceed in ascending order
from there again, then jump back to the higher position where they had
left off before, and so on.
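For anyone who wants to watch the port-selection order on their own
machine, a minimal sketch (not part of the original report; it simply
opens short-lived loopback connections and records each source port the
kernel picks):

```python
import socket

def sample_ephemeral_ports(n=20):
    """Open n short-lived TCP connections to a local listener and
    record the ephemeral source port the kernel picks for each one."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))  # let the kernel choose the listening port
    srv.listen(128)
    addr = srv.getsockname()
    ports = []
    for _ in range(n):
        c = socket.create_connection(addr)
        ports.append(c.getsockname()[1])
        conn, _ = srv.accept()
        conn.close()
        c.close()
    srv.close()
    return ports

if __name__ == "__main__":
    # On an affected kernel the sequence jumps around instead of
    # ascending steadily.
    print(sample_ephemeral_ports())
```

Running this repeatedly while comparing kernels should make the
difference in ordering visible without needing tcpdump.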

* What exactly did you do (or not do) that was effective (or
  ineffective)?

I first widened the ephemeral port range by setting
net.ipv4.ip_local_port_range=1024 60999 (from the default 32768 60999).
This alleviated the situation (the timeouts became less frequent), but
did not solve the problem.

I then set net.ipv4.tcp_tw_reuse = 0 (from the default 2), which did not
change anything (as is expected in this case).
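For reference, the two settings described above correspond to the
following commands (run as root; a sketch of the knobs involved, not a
fix for this bug):

```shell
# Widen the ephemeral port range (the default here was 32768 60999)
sysctl -w net.ipv4.ip_local_port_range="1024 60999"

# Disable TIME_WAIT reuse for outgoing connections (the default is 2)
sysctl -w net.ipv4.tcp_tw_reuse=0

# To persist across reboots, e.g. in /etc/sysctl.d/local.conf:
#   net.ipv4.ip_local_port_range = 1024 60999
#   net.ipv4.tcp_tw_reuse = 0
```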

* What was the outcome of this action?

None of the measures I took proved effective. 

So I downgraded the kernel to 5.10.0-14, and the problem immediately
went away. The web servers now cycle through the available ~60,000
ephemeral ports and only come around to reusing one long after the
corresponding socket on the backend server has been closed.


I am opening this bug here because I am not knowledgeable enough about
the Debian kernel patches to decide whether or not this issue is already
present in the upstream vanilla kernel.

Thank you for looking into this.

Best regards

Markus Wernig

-- System Information:
Debian Release: 11.5
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-14-amd64 (SMP w/4 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.utf8), LANGUAGE=en_US:en
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages linux-image-5.10.0-14-amd64 depends on:
ii  initramfs-tools [linux-initramfs-tool]  0.140
ii  kmod                                    28-1
ii  linux-base                              4.6

Versions of packages linux-image-5.10.0-14-amd64 recommends:
ii  apparmor 2.13.6-10
ii  firmware-linux-free  20200122-1

Versions of packages linux-image-5.10.0-14-amd64 suggests:
pn  debian-kernel-handbook  
ii  grub-pc 2.06-3~deb11u2
pn  linux-doc-5.10  



Bug#1022126: mpt3sas broken with xen dom0

2022-11-06 Thread Radoslav Bodó
To be a bit more precise (sorry for being a bit sloppy): we don't see
the same `swiotlb buffer` error, but rather


```
mpt3sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:11069/_scsih_probe()!

```
as reported in `1023...@bugs.debian.org`



```
# dmesg | grep mpt
[0.233229]   Normal   empty
[0.233231]   Device   empty
[0.642600] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
[0.642676] TAA: Vulnerable: Clear CPU buffers attempted, no microcode
[0.642752] MMIO Stale Data: Vulnerable: Clear CPU buffers attempted, no microcode

[3.243654] mpt3sas version 35.100.00.00 loaded
[3.244357] mpt3sas_cm0: 32 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (917052 kB)
[3.302439] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k

[3.302575] mpt3sas_cm0: MSI-X vectors supported: 96
[3.302733] mpt3sas_cm0:  0 40
[3.308666] mpt3sas_cm0: High IOPs queues : disabled
[3.308747] mpt3sas0-msix0: PCI-MSI-X enabled: IRQ 396
[3.308826] mpt3sas0-msix1: PCI-MSI-X enabled: IRQ 397
[3.308905] mpt3sas0-msix2: PCI-MSI-X enabled: IRQ 398
[3.308983] mpt3sas0-msix3: PCI-MSI-X enabled: IRQ 399
[3.309062] mpt3sas0-msix4: PCI-MSI-X enabled: IRQ 400
[3.309141] mpt3sas0-msix5: PCI-MSI-X enabled: IRQ 401
[3.309221] mpt3sas0-msix6: PCI-MSI-X enabled: IRQ 402
[3.309224] mpt3sas0-msix7: PCI-MSI-X enabled: IRQ 403
[3.309466] mpt3sas0-msix8: PCI-MSI-X enabled: IRQ 404
[3.309544] mpt3sas0-msix9: PCI-MSI-X enabled: IRQ 405
[3.309623] mpt3sas0-msix10: PCI-MSI-X enabled: IRQ 406
[3.309703] mpt3sas0-msix11: PCI-MSI-X enabled: IRQ 407
[3.309706] mpt3sas0-msix12: PCI-MSI-X enabled: IRQ 408
[3.309945] mpt3sas0-msix13: PCI-MSI-X enabled: IRQ 409
[3.310024] mpt3sas0-msix14: PCI-MSI-X enabled: IRQ 410
[3.310104] mpt3sas0-msix15: PCI-MSI-X enabled: IRQ 411
[3.310183] mpt3sas0-msix16: PCI-MSI-X enabled: IRQ 412
[3.310186] mpt3sas0-msix17: PCI-MSI-X enabled: IRQ 413
[3.310427] mpt3sas0-msix18: PCI-MSI-X enabled: IRQ 414
[3.310506] mpt3sas0-msix19: PCI-MSI-X enabled: IRQ 415
[3.310585] mpt3sas0-msix20: PCI-MSI-X enabled: IRQ 416
[3.310588] mpt3sas0-msix21: PCI-MSI-X enabled: IRQ 417
[3.310827] mpt3sas0-msix22: PCI-MSI-X enabled: IRQ 418
[3.310906] mpt3sas0-msix23: PCI-MSI-X enabled: IRQ 419
[3.310985] mpt3sas0-msix24: PCI-MSI-X enabled: IRQ 420
[3.310988] mpt3sas0-msix25: PCI-MSI-X enabled: IRQ 421
[3.316842] mpt3sas0-msix26: PCI-MSI-X enabled: IRQ 422
[3.316845] mpt3sas0-msix27: PCI-MSI-X enabled: IRQ 423
[3.317008] mpt3sas0-msix28: PCI-MSI-X enabled: IRQ 424
[3.317011] mpt3sas0-msix29: PCI-MSI-X enabled: IRQ 425
[3.317180] mpt3sas0-msix30: PCI-MSI-X enabled: IRQ 426
[3.317182] mpt3sas0-msix31: PCI-MSI-X enabled: IRQ 427
[3.317384] mpt3sas0-msix32: PCI-MSI-X enabled: IRQ 428
[3.317386] mpt3sas0-msix33: PCI-MSI-X enabled: IRQ 429
[3.317569] mpt3sas0-msix34: PCI-MSI-X enabled: IRQ 430
[3.317572] mpt3sas0-msix35: PCI-MSI-X enabled: IRQ 431
[3.317736] mpt3sas0-msix36: PCI-MSI-X enabled: IRQ 432
[3.317739] mpt3sas0-msix37: PCI-MSI-X enabled: IRQ 433
[3.317929] mpt3sas0-msix38: PCI-MSI-X enabled: IRQ 434
[3.317931] mpt3sas0-msix39: PCI-MSI-X enabled: IRQ 435
[3.318197] mpt3sas_cm0: iomem(0xac40), mapped(0xa634eda5), size(65536)

[3.318200] mpt3sas_cm0: ioport(0x6000), size(256)
[3.378171] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[3.406933] mpt3sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(7), sge_per_io(128), chains_per_io(19)
[3.412485] mpt3sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:11069/_scsih_probe()!

```



Bug#1022126: mpt3sas driver does not load

2022-11-06 Thread Radoslav Bodó

Hello,

We are facing the very same issue (kernel 5.10.149-2 does not load the
driver, with the same error message) on a Dell R440 with an ME4012 disk
array attached via HBA.


(taken from working 5.10.136-1)
[3.744036] mpt3sas_cm0: FW Package Ver(15.17.09.06)
[3.744740] mpt3sas_cm0: LSISAS3008: FWVersion(15.15.06.00), ChipRevision(0x02), BiosVersion(17.07.01.00)


We are also using Xen 4.14.5+24-g87d90d511c-1.

I've tried a workaround found elsewhere (adding the kernel parameter
mpt3sas.max_queue_depth=1) without success.


Any suggestion would be appreciated.
bodik



Bug#1023183: mpt3sas driver does not load

2022-11-06 Thread Radoslav Bodó

Hello,

We are facing the very same issue (kernel 5.10.149-2 does not load the
driver, with the same error message) on a Dell R440 with an ME4012 disk
array attached via HBA.


(taken from working 5.10.136-1)
[3.744036] mpt3sas_cm0: FW Package Ver(15.17.09.06)
[3.744740] mpt3sas_cm0: LSISAS3008: FWVersion(15.15.06.00), ChipRevision(0x02), BiosVersion(17.07.01.00)


We are also using Xen 4.14.5+24-g87d90d511c-1.

I've tried a workaround found elsewhere (adding the kernel parameter
mpt3sas.max_queue_depth=1) without success.


Any suggestion would be appreciated.
bodik



Bug#992304: possible workaround

2022-11-06 Thread Claude Poitras
On Fri, 4 Nov 2022 20:55:37 + Dan Stefura wrote:
> Try using the kernel parameter:  intel_iommu=off

It works for me on a Dell T140 with a PERC H330 adapter.

root@zurix:~# uname -a
Linux zurix 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/
Linux
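For anyone else applying this workaround: the usual way to make a
kernel parameter permanent on Debian is via /etc/default/grub (a
sketch; merge the parameter into whatever your existing cmdline
already contains):

```shell
# /etc/default/grub: append the parameter to the default cmdline
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=off"

# Then regenerate the grub configuration and reboot:
update-grub
```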



Re: RFC: bootloader/initramfs protocol v2

2022-11-06 Thread Luca Boccassi
On Tue, 2022-11-01 at 21:29 +0100, Bastian Blank wrote:
> [Cc Ben as he gave feedback to the last iteration, Luca as he wanted
> something actionable]
> 
> Hi folks
> 
> As I abandoned the last try and also learned some new things in the
> meantime, I'd like to discuss another try at re-organizing how Debian
> does boot loaders and initramfs.  This mail mostly tries to get the
> goals
> and requirements straight.
> 
> Please provide feedback.  Also for missing stuff.
> 
> Regards,
> Bastian

Thank you, this is great.

> ## Goals
> 
> - Setup complete boot entries from packaged and generated files
> - Support dumb file systems for /boot by default, so boot loaders can
>   drop complex file system support.
> - Re-create stuff in /boot from scratch
> - Remove symlink handling from kernel package
> - Single entry point for packages and admins, aka no tool specific
>   "update-initramfs" anymore

Could you clarify what you mean by "single entry point" here? It's the
only point I can't quite decode. A trigger?

I would like to suggest this as an additional, explicit goal, rather
than an implicit one:

End result should be fully compatible with the BLS (for the readers:
https://uapi-group.org/specifications/specs/boot_loader_specification/
)

Given we are doing something new, it's well worth being fully
compatible with cross-distro standards so that we can leverage existing
toolchains.
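For readers unfamiliar with the BLS, a Type #1 boot entry is just a
small text file dropped under /boot/loader/entries/; something like
this (all values here are illustrative, not taken from a real system):

```
title      Debian GNU/Linux 11 (bullseye)
version    5.10.0-19-amd64
machine-id a1b2c3d4e5f60718293a4b5c6d7e8f90
linux      /vmlinuz-5.10.0-19-amd64
initrd     /initrd.img-5.10.0-19-amd64
options    root=/dev/mapper/vg0-root ro quiet
```

Any BLS-aware boot loader can enumerate and sort these entries without
distro-specific configuration.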

> ## Requirements
> 
> - Read package files in /usr
> - Uses observed state, not changes provided by maintainer scripts
> - Dumb writes to /boot. No renames, no sym or hard links. Can use
>   reflinks if possible.
> - Generate initramfs if needed, creates new entry if re-generated
> - Keep older stuff (like previous kernel, initramfs) for a short
> while
> - Possible to support multiple targets, like grub, zipl, flash-kernel
> - Multiple inputs, plain kernel, UKI, also in one version
> - Combinations of inputs, Xen+Linux
> - Completely outside of kernel package
> - Backward compatible support for stuff packaged in /boot
> - Use config for kernel command line
> 
> ## Open questions
> 
> - How to select default entry if supported, just sort by version and
> use
>   newest?  This also works somewhat in BLS.

Yes, we should follow BLS on this, so that the end result is
predictable, well-defined, and doesn't vary wildly from other distros.

> - How to interface with boot loaders, just absorb all knowledge about
>   config and only run install tools if necessary (like with zipl as
>   block map based loader)?  grub might get support for BLS, at least
>   there exists a patch somewhere, that will make it easier, and we
> can
>   just iterate over config files defining one entry each.

Self-described and auto-discoverable images should be the primary
means. For reference, the patch that adds BLS support to Grub is here:

https://github.com/osteffenrh/grub2/commit/d0c402c96159423242cf7b612773126ccc11a83b

UKIs will be proposed for Fedora; this should land as part of that work
and do a lot of the heavy lifting for us:

https://fedoraproject.org/wiki/Changes/Unified_Kernel_Support_Phase_1

In fact, if Grub can do UKIs, do we even need Type 1 entries (separate
textual config files) for anything at that point?

> ## Prior works
> 
> - Current Debian: change based, overwrites by dpkg in /boot,
> versioned
>   by ABI
> - systemd kernel-install: only BLS as target, which nothing used by
>   default in Debian can read
> More?

For reference, Debian images can be built using dracut + sd-boot +
kernel-install; images built with mkosi work like that:

https://github.com/systemd/mkosi

> ## File system layout
> 
> Some initial ideas about how stuff could look.
> 
> ### Boot file system (/boot)
> 
> This file system might be shared, so everything is somewhat
> referenced
> to the machine id.  This should be somewhat compatible with BLS (type
> 1).

This just dropped, and it talks about lots of these concepts:

https://0pointer.net/blog/linux-boot-partitions.html

> * /boot/$machineid/
>   * ./grub/: config snippets, so we can do "no overwrite"
> 
> ### Distribution file system (/usr)
> 
> * /usr/lib/boot/$package(_$modifier)/
>   * ./data: raw data for item
>   * ./metadata: info about item in undetermined format

What would 'metadata' be in this context?

-- 
Kind regards,
Luca Boccassi


signature.asc
Description: This is a digitally signed message part


Bug#778849: Is "wishlist" appropriate for this?

2022-11-06 Thread Joseph Carter
I have yet to investigate intrigeri's suggestions from 2017; however, I
would suggest that this is something that needs to be upgraded from
wishlist in 2022, and here's the reason, simply enough:

root@aki:~# nvme smart-log /dev/nvme0
Smart Log for NVME device:nvme0 namespace-id:
[..]
unsafe_shutdowns: 106
[..]
num_err_log_entries : 284
[..]
root@aki:~# nvme smart-log /dev/nvme1
Smart Log for NVME device:nvme1 namespace-id:
[..]
unsafe_shutdowns: 121
[..]
num_err_log_entries : 291
[..]

Given that the frequency and number of SMART errors are deemed an
indicator of drive health, that's bad. Improper shutdown on NVMe
devices could also be particularly problematic, because they have
caches and wear-leveling and cleanup cycles that can happen at any time
while the drive is "running", until a shutdown command is issued and
responded to. There might actually be some risk of data
corruption/loss. (I doubt it with commodity consumer SSDs, but Debian
isn't just run on those.)
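For keeping an eye on these counters over time, the plain-text output
above can be scraped with something like this (a sketch; nvme-cli can
also emit JSON via `nvme smart-log -o json`, which would be more
robust):

```python
def parse_smart_log(text):
    """Parse `nvme smart-log` plain-text output into a dict mapping
    field names to integer values; non-numeric fields are skipped."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        value = value.strip().replace(",", "")
        if value.isdigit():
            fields[key] = int(value)
    return fields

# The counters quoted in the report above:
sample = """\
unsafe_shutdowns: 106
num_err_log_entries : 284
"""
print(parse_smart_log(sample))
# {'unsafe_shutdowns': 106, 'num_err_log_entries': 284}
```

Logging the `unsafe_shutdowns` value at each boot makes it easy to spot
whether a given shutdown was clean.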

For a few weeks, we tried on #debian to sort out the cause of the above
errors. We wondered: an NVMe drive quirk Linux doesn't support? Maybe
Linux is issuing the shutdown command and not waiting long enough?
There's Google bait suggesting that's a problem, and there are some BS
factoids in dpkg (which I should remove the next time I connect to
OFTC) describing the "solution", which I've since discovered doesn't
work. This was hard to test because obviously no logger is running at
that point of the shutdown process.

The root cause of the problem isn't an unknown quirk; it's that I have
LVM on LUKS. (See what I did there?) I connected a drive with an
unencrypted Debian system on it, mounted my main installation's /boot
and even the LUKS/LVM root from it, and never got a single unsafe
shutdown despite multiple reboots/shutdowns, because that temp
install's root was not on LVM-on-LUKS backing.

Dracut is a suboptimal solution, in part because after three days of
trying to get it to boot my system I've yet to see it do so, and
because while there's lots of documentation for it, it's for other
distributions, it's wrong, it's obsolete, or it's misleading. That
includes one rant from 2017 that offers a profanity-laden survey of
most of the others and why they don't work for Debian systems, or at
all.

As far as I can tell, you either need to significantly modify grub, or
switch to systemd-boot, or set up Dracut to generate an EFI executable
blob using files that aren't available on a Debian system, or throw up
my hands and go use Fedora until I understand Dracut well enough to try
to use it on Debian. Or something. Again: what sparse documentation
exists is spotty, inconsistent, and at least five years out of date.
Dracut is not how Debian does things, just like OpenRC and rEFInd are
not how Debian does things. It's all there if you want to set it up,
but you're not going to find many Debian resources on using it.

I think unsafe shutdowns of NVMe devices are actually a bug, and I
think they could cause data loss or corruption on more advanced
hardware than I'm using. There are a few options for addressing it, and
most of them become problems beyond initramfs-tools' scope. But this
seven-year-old bug might be the path of least resistance.

Joseph



Bug#1023103: linux-image-5.10.0-14-amd64: Wrong power value on APU2 with bullseye

2022-11-06 Thread Alexis Domjan

Hi Salvatore,


> Thank you for testing. And before 5.10.113, can you pinpoint a
> version which showed sensible values? If so, you have a starting
> point to bisect where the problem was introduced, which would be
> very helpful in determining the potential issue.


A friend of mine did these tests and here are the results:

4.19.260: does NOT have the bug
4.19.264: does NOT have the bug
5.0.1:    HAS the bug
5.1:      HAS the bug

So it looks like the bug first appeared in the 5.0 series.

Kind regards,
Alexis Domjan