Ciprian,

> On Nov 19, 2019, at 4:37 PM, Ciprian Dorin Craciun 
> <[email protected]> wrote:
> 
> On Tue, Nov 19, 2019 at 10:38 PM Ciprian Dorin Craciun
> <[email protected]> wrote:
>> At the following link you can find an extract of `dmesg` after the
>> sysrq trigger.
>> 
>>  
>> https://scratchpad.volution.ro/ciprian/f89fc32a0bbd0ae6d6f3edbbc3ee111c/b9c3bc4f795bbe9e7eaca93b0a57bea0.txt
> 
> 
> I forgot to mention that in this case the CPU didn't go up to 100%, in
> fact it was quite "quiet".  (The 100% CPU seems to happen only after a
> process "blocks" and I try to `SIGTERM` or `SIGKILL` it.)

Thank you for the backtraces.  I agree that 'gm' is the problematic thread;
it appears to be stuck in rxi_WriteProc waiting for the Rx packet transmit
window to advance.  That is, it's waiting for acknowledgments - probably
from the fileserver.  Unfortunately the rest of the backtrace seems muddled
and so we can't tell exactly what the client was doing.  In fact, many of
the backtraces are incomplete.  However, I can tell that all the cache
manager kernel threads (housekeeping et al) are in a normal/idle state.
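
To illustrate what "waiting for the transmit window to advance" means, here
is a toy sliding-window sender.  It is only a sketch of the general
technique, not the OpenAFS Rx code: the class name, its fields, and the
window size of 4 are all assumptions made up for the example.  The point is
simply that once the window is full, the writer cannot make progress until
an acknowledgment arrives.

```python
class ToyWindowSender:
    """Toy sliding-window sender; NOT the OpenAFS Rx implementation."""

    def __init__(self, window_size):
        self.window_size = window_size  # max packets allowed in flight (assumed value)
        self.first_unacked = 0          # oldest sequence number not yet acknowledged
        self.next_seq = 0               # next sequence number to transmit

    def can_send(self):
        # Sending is allowed only while the in-flight count is below the window.
        return self.next_seq < self.first_unacked + self.window_size

    def send(self):
        # Returns False where a real writer would block instead.
        if not self.can_send():
            return False
        self.next_seq += 1
        return True

    def ack(self, seq):
        # Cumulative ack: everything up to and including seq was received.
        if self.first_unacked <= seq < self.next_seq:
            self.first_unacked = seq + 1


sender = ToyWindowSender(window_size=4)

# Try to send 10 packets with no acks arriving: only the first 4 go out.
sent = sum(sender.send() for _ in range(10))

# The window is now full; with no acks the writer would wait indefinitely.
stalled = not sender.can_send()
```

In this model, a later call such as `sender.ack(1)` moves the window forward
by two packets and lets sending resume.  If the acks never arrive, the
writer stays stalled, which matches what the 'gm' backtrace suggests.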

I did some preliminary searches through the linux kernel git repo for recent
changes in 5.3.9 and older, but didn't see anything that seemed relevant.

If I have some time later this week, I may try to reproduce this issue.  
However, there's no guarantee I will be able to do so, so it would be better
if we could either obtain more information from your site, or if you could
narrow the problem down to a simpler test case.

Do you have FileLogs and/or fileserver audit logs for the time in question?

Thanks,
--
Mark Vitale
Sine Nomine Associates
20 Years of Customer Success



