Dear Salvatore,
I've already started bisecting. It will take some time. The bug usually
appears only after a few hours, and unfortunately I am not able to
trigger it faster. So if the bug appears, I can step forward easily, but
if it does not, it's hard to decide whether it is still present and
simply has not occurred yet, or whether the current version is a good
one. I'll do my best.
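A rough way to decide how long a bisect step should run without a hang
before it is marked good can be sketched as below. This is only a
sketch under assumptions: it treats the time until a hang on a bad
kernel as exponentially distributed, the 3-hour mean is an illustrative
figure and not a measurement, and the function names are hypothetical.

```python
import math


def hours_to_wait(mean_hours_to_hang: float, max_miss_probability: float) -> float:
    """Run time after which a hang-free run suggests a good kernel.

    Assumption: on a bad kernel the time until the first hang is
    exponentially distributed, so P(no hang within t) = exp(-t / mean).
    Solving exp(-t / mean) <= p for t gives t >= mean * ln(1 / p).
    """
    if mean_hours_to_hang <= 0:
        raise ValueError("mean_hours_to_hang must be positive")
    if not 0 < max_miss_probability < 1:
        raise ValueError("max_miss_probability must be between 0 and 1")
    return mean_hours_to_hang * math.log(1 / max_miss_probability)


def miss_probability(mean_hours_to_hang: float, hours_run: float) -> float:
    """Chance that a bad kernel survives hours_run without a hang."""
    if mean_hours_to_hang <= 0 or hours_run < 0:
        raise ValueError("mean must be positive and hours_run non-negative")
    return math.exp(-hours_run / mean_hours_to_hang)


if __name__ == "__main__":
    # Illustrative numbers only: a 3-hour mean and a 5% risk of
    # wrongly marking a bad kernel as good.
    print(f"{hours_to_wait(3.0, 0.05):.1f} hours")
```

With these assumed numbers, each step that shows no hang would have to
run for about three times the mean time to failure, and the risk of a
wrong "good" verdict adds up over the bisect steps.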
I will also contact the linux-nfs mailing list.
As far as I remember, it started nearly a year ago, when I switched to
Debian's kernel. I don't know exactly which version it was at that time.
However, I've checked Debian's patches, and I did not find anything
related to NFS.
Regards,
Richard
On 2024-05-20 21:07, Salvatore Bonaccorso wrote:
Hi Richard,
On Mon, May 20, 2024 at 09:27:24AM +0000, Richard Kojedzinszky wrote:
Package: src:linux
Version: 6.1.90-1
Severity: normal
X-Debbugs-Cc: richard+debian+bugrep...@kojedz.in
Dear Maintainer,
I am running Kubernetes on Debian, and pods are mounting multiple NFS
shares. I am running dovecot processes in pods, which receive mails from
the internet and also serve as an IMAP server for clients. I am
monitoring my mail system by sending mails periodically (every 15
seconds) and also downloading them via IMAP. A few times I found that
some dovecot process was stuck in D state; a reboot was always needed to
recover from that state.
Unfortunately, I was not able to trigger the bug really fast; I don't
really know what operations dovecot issues, and in what order, to
trigger this behavior. So until I get closer, I've set up a similar but
smaller environment with just a single dovecot process. It does the
same work, delivering only test mails locally and serving them via IMAP
to the monitoring client, storing everything on NFS. Fortunately, this
also triggers the bug: after a few hours one of the dovecot processes is
stuck in D state. The kernel also shows the blocked state:
As you seem to be in the lucky position of being able to trigger the
issue in a more localized setup, might you:
- try more recent kernels from upper suites as well (6.8.9-1 in
unstable would be ideal to check whether the issue is there as well).
- I read that you cannot trigger it with 5.15. If you build 6.1.90 from
upstream without Debian patches, I assume you can trigger the issue
likewise? If so, could you bisect the changes introducing the issue?
This is a cumbersome process, in particular if you need a few hours to
trigger it. So maybe the following point could be done first:
- Can you report the issue to the linux-nfs list, keeping us in the
loop?
Regards,
Salvatore