To: [email protected]
From: [email protected]
Subject: Unexpected pthread preemption
Package: linux-source-6.12
Version: 6.12.48-amd64 
OS: Debian/Trixie

Overview:
With the Debian Trixie kernel, we are seeing pthreads preempted unexpectedly.
In our application's work model, we isolate a number of cores from the OS so
that our application is the only thing running on them. All of the pthreads
are set up as SCHED_FIFO and run at the same priority, so it should be up to
each pthread to decide when to allow preemption. We also call
mlockall(MCL_CURRENT|MCL_FUTURE) to lock all of the application's memory.
Additionally, as SO modules are loaded, we make sure all pages from these
modules are pre-faulted in. The application runs with the locked-memory limit
(ulimit -l) set to unlimited.
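
For reference, a minimal sketch of the kind of setup described above. The
function names and the priority value are hypothetical and are not taken from
the real application; core isolation and SO pre-faulting are not shown.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Lock all current and future pages of the process in RAM.
 * Returns 0 on success, -1 on failure (for example when the
 * locked-memory limit is too low). */
int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return -1;
    }
    return 0;
}

/* Initialise attr so that a thread created with it requests SCHED_FIFO
 * at the given priority. Returns 0 or a pthread error number. */
int make_fifo_attr(pthread_attr_t *attr, int priority)
{
    struct sched_param sp;
    int rc;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = priority;

    rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    /* Without PTHREAD_EXPLICIT_SCHED the policy set below is ignored
     * and the new thread inherits the creator's scheduling policy. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    if (rc == 0)
        rc = pthread_attr_setschedparam(attr, &sp);
    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}
```

A caller would invoke lock_all_memory() once at startup and pass the
attribute from make_fifo_attr() to pthread_create(). Creating the thread
itself needs real-time scheduling privileges, so pthread_create() may return
EPERM when run unprivileged.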

Previously, on Debian/buster, we had to set vm.compact_unevictable_allowed = 0
because the default setting caused unexpected eviction, which led to similar
behavior. We have confirmed that this setting is still zero.
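
One way to check this setting from inside the application at startup is to
read the sysctl through procfs. This is a sketch only; the helper names are
hypothetical and this is not necessarily how the value was confirmed here.

```c
#include <stdio.h>

/* Read a single integer from a procfs-style file.
 * Returns 0 and stores the value in *out, or -1 on any failure. */
int read_long_from_file(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Returns 1 if compaction is allowed to examine unevictable (mlocked)
 * pages, 0 if it is not, and -1 if the sysctl could not be read. */
int compaction_may_touch_mlocked(void)
{
    long v;

    if (read_long_from_file("/proc/sys/vm/compact_unevictable_allowed",
                            &v) != 0)
        return -1;
    return v != 0;
}
```

An application could log a warning or refuse to start if
compaction_may_touch_mlocked() returns anything other than 0.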

While debugging this, we found that a pthread was switched out with
prev_state=D (uninterruptible sleep). Looking at what was happening at that
point, we determined that it was a page fault caused by the instruction the
thread was trying to run. The faulting instruction had already run numerous
times by this point, so there was no reason for this page to need faulting in.
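
As a complementary check from user space, a thread can sample its own fault
counters after warm-up and watch for increases. This is a sketch under
assumptions: the struct and function names are hypothetical, RUSAGE_THREAD is
Linux-specific, and whether a compaction-related fault is reflected in these
counters would need to be verified.

```c
#define _GNU_SOURCE
#include <sys/resource.h>

/* Page-fault counters for the calling thread. */
struct fault_counts {
    long minor; /* faults serviced without I/O */
    long major; /* faults that required I/O */
};

/* Fill fc with the calling thread's fault counters.
 * Returns 0 on success, -1 if getrusage() fails. */
int thread_fault_counts(struct fault_counts *fc)
{
    struct rusage ru;

    if (getrusage(RUSAGE_THREAD, &ru) != 0)
        return -1;
    fc->minor = ru.ru_minflt;
    fc->major = ru.ru_majflt;
    return 0;
}
```

A thread would take one sample once its working set is faulted in and compare
later samples against it; any growth means the thread faulted after warm-up.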

We have retested using the bookworm kernel and do not see this issue.

I attempted to isolate this issue. I disabled CONFIG_TRANSPARENT_HUGEPAGE but
still hit the issue. I then disabled CONFIG_COMPACTION and have now run for
nearly 72 hours without a failure (previously we would see failures in under
15 hours). Unfortunately, disabling compaction is not something we want to do,
but it at least appears to show that something in that area changed and is
causing this issue.

Since the 6.1 kernel works for us and 6.12 is what fails, it will take some 
time to examine the changes in between to determine if a particular commit is 
causing this issue.

David Hoyer
