Ciprian,

> On Nov 19, 2019, at 4:37 PM, Ciprian Dorin Craciun <[email protected]> wrote:
>
> On Tue, Nov 19, 2019 at 10:38 PM Ciprian Dorin Craciun
> <[email protected]> wrote:
>> At the following link you can find an extract of `dmesg` after the
>> sysrq trigger.
>>
>> https://scratchpad.volution.ro/ciprian/f89fc32a0bbd0ae6d6f3edbbc3ee111c/b9c3bc4f795bbe9e7eaca93b0a57bea0.txt
>
> I forgot to mention that in this case the CPU didn't go up to 100%, in
> fact it was quite "quiet". (The 100% CPU seems to happen only after a
> process "blocks" and I try to `SIGTERM` or `SIGKILL` it.)
Thank you for the backtraces.

I agree that 'gm' is the problematic thread; it appears to be stuck in
rxi_WriteProc waiting for the Rx packet transmit window to advance.
That is, it's waiting for acknowledgments - probably from the fileserver.

Unfortunately the rest of the backtrace seems muddled and so we can't tell
exactly what the client was doing. In fact, many of the backtraces are
incomplete. However, I can tell that all the cache manager kernel threads
(housekeeping et al) are in a normal/idle state.

I did some preliminary searches through the linux kernel git repo for
recent changes in 5.3.9 and older, but didn't see anything that seemed
relevant.

If I have some time later this week, I may try to reproduce this issue.
However, there's no guarantee I will be able to do so, so it would be
better if we could either obtain more information from your site, or if
you could narrow the problem down to a simpler test case.

Do you have FileLogs and/or fileserver audit logs for the time in question?

Thanks,
--
Mark Vitale
Sine Nomine Associates
20 Years of Customer Success

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
