To: [email protected]
From: [email protected]
Subject: Unexpected pthread preemption

Package: linux-source-6.12
Version: 6.12.48-amd64
OS: Debian/Trixie

Overview:

With the Debian Trixie kernel we are seeing pthreads preempted unexpectedly.

In our application's work model, we isolate a number of cores from the OS so that our application is the only thing running on those cores. All of the pthreads are set up as SCHED_FIFO and run at the same priority, so it should be up to each pthread to decide when it allows preemption. We also call mlockall(MCL_CURRENT|MCL_FUTURE) to lock all of the application's memory. Additionally, as SO modules are loaded, we make sure all pages from these modules are pre-faulted in. The application runs with the locked-memory ulimit set to unlimited.

Previously, on Debian/buster, we had to set vm.compact_unevictable_allowed = 0 because the default setting was causing unexpected eviction, which led to similar behavior. We have confirmed that this setting is still zero.

While debugging, we found that a pthread was switched out with prev_state=D (uninterruptible sleep). Looking at what was happening at that point, we determined that it was a page fault on the instruction the thread was trying to execute. The faulting instruction had already run numerous times by then, so there was no reason for its page to need faulting in.

We have retested using the bookworm kernel and do not see the issue there.

I attempted to isolate the issue. I disabled CONFIG_TRANSPARENT_HUGEPAGE but still hit it. I then disabled CONFIG_COMPACTION and have now run for nearly 72 hours without a failure (previously we would see failures in under 15 hours). Unfortunately, shutting off COMPACTION is not something we want to do, but it at least suggests that something in that area changed and is causing this issue.

Since the 6.1 kernel works for us and 6.12 fails, it will take some time to examine the changes in between and determine whether a particular commit is responsible.

David Hoyer
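
For reference, the thread setup described in the overview can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the application's actual code: the function names (lock_all_memory, make_fifo_on_cpu, prefault_range, thread_fault_counts) are hypothetical, the CPU number and priority are left to the caller, and the fault counter via getrusage(RUSAGE_THREAD) is an added diagnostic that the report itself does not mention.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Lock all current and future mappings of the process into RAM.
 * Returns 0 on success, otherwise the errno from mlockall(). */
int lock_all_memory(void)
{
    return mlockall(MCL_CURRENT | MCL_FUTURE) == 0 ? 0 : errno;
}

/* Pin the calling thread to one (assumed isolated) CPU and switch it
 * to SCHED_FIFO at the given priority. Returns 0 or an error number.
 * Usually needs CAP_SYS_NICE or a suitable RLIMIT_RTPRIO. */
int make_fifo_on_cpu(int cpu, int priority)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);

    int rc = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (rc != 0)
        return rc;

    struct sched_param sp = { .sched_priority = priority };
    return pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
}

/* Read one byte from every page-sized step of [buf, buf + len) so the
 * pages are faulted in. Reads only, so it is also safe on read-only
 * mappings such as shared-object text. Returns the number of touches. */
size_t prefault_range(const volatile unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t touched = 0;

    for (size_t off = 0; off < len; off += page) {
        (void)buf[off];          /* volatile read forces the access */
        touched++;
    }
    return touched;
}

/* Report minor and major page-fault counts for the calling thread.
 * Returns 0 on success, otherwise the errno from getrusage(). */
int thread_fault_counts(long *minor, long *major)
{
    struct rusage ru;
    if (getrusage(RUSAGE_THREAD, &ru) != 0)
        return errno;

    *minor = ru.ru_minflt;
    *major = ru.ru_majflt;
    return 0;
}
```

A thread would typically call lock_all_memory() once per process, then make_fifo_on_cpu() and prefault_range() on its own working set before entering its main loop. Sampling thread_fault_counts() before and after a long run is one possible way to check from userspace whether a locked, pre-faulted thread still took page faults.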

