On Wed, Nov 20, 2019 at 7:03 PM Mark Vitale <[email protected]> wrote:

> Thank you for the backtraces. I agree that 'gm' is the problematic thread;
> it appears to be stuck in rxi_WriteProc waiting for the Rx packet transmit
> window to advance. That is, it's waiting for acknowledgments - probably
> from the fileserver.

It's true that the test was performed over wireless; however, the same behaviour was encountered even over Gigabit LAN. (This is a personal setup, server, network and client alike, and there was light to no usage on the client, the server and the network.)

> Unfortunately the rest of the backtrace seems muddled and so we can't tell
> exactly what the client was doing. In fact, many of the backtraces are
> incomplete.

I haven't deleted anything from any particular process stacktrace. However, I did delete processes that have nothing to do with AFS, or whose stack did not contain `afs`. (If you think it would be useful I can send you privately a complete, uncensored output.)

> If I have some time later this week, I may try to reproduce this issue.
> However, there's no guarantee I will be able to do so, so it would be better
> if we could either obtain more information from your site, or if you could
> narrow the problem down to a simpler test case.

I'll try to reproduce this without the actual build system. (Using, say, `stat`, `cp` and `xargs`.)

> Do you have FileLogs and/or fileserver audit logs for the time in question?

Yes, I do have access to them. The following is the syslog output from the OpenAFS server in a 5-minute window around the stacktrace sent yesterday:

~~~~
FindClient: stillborn client 0x7fe9b0012dc0(77749fe8); conn 0x7fe9d800e390 (host 172.30.214.35:7001) had client 0x7fe9b00131d0(77749fe8)
FindClient: stillborn client 0x7fe9b00132a0(77749fec); conn 0x7fe9d800e660 (host 172.30.214.35:7001) had client 0x7fe9b0012dc0(77749fec)
FindClient: stillborn client 0x7fe9b0013030(77749fec); conn 0x7fe9d800e660 (host 172.30.214.35:7001) had client 0x7fe9b0012dc0(77749fec)
FindClient: stillborn client 0x7fe9b0012cf0(77749fec); conn 0x7fe9d800e660 (host 172.30.214.35:7001) had client 0x7fe9b0012dc0(77749fec)
~~~~

No information is present in `/var/log/openafs` in that timeframe.
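
Regarding the reproduction without the build system mentioned above, a minimal sketch of what such a test might look like follows. The function name `stress_tree`, its arguments and the job count are hypothetical placeholders, not taken from the actual setup or build system, and it assumes GNU `find`, `xargs`, `stat` and `cp` (for `-print0`, `-0`, `-P` and `-t`). Whether this access pattern is enough to trigger the hang is untested.

```shell
#!/bin/sh
# Hypothetical stress sketch: run many concurrent stat and cp calls.
# Arguments (all placeholders): source tree, destination dir, parallel jobs.
stress_tree() {
    src=$1
    dst=$2
    njobs=${3:-8}

    mkdir -p "$dst" || return 1

    # Phase 1: many concurrent stat calls over every file in the tree.
    find "$src" -type f -print0 \
        | xargs -0 -n 16 -P "$njobs" stat -- >/dev/null || return 1

    # Phase 2: many concurrent copies into one flat destination directory.
    # Note: files with the same base name will overwrite each other.
    find "$src" -type f -print0 \
        | xargs -0 -n 16 -P "$njobs" cp -t "$dst" -- || return 1
}
```

It could be invoked as, for example, `stress_tree /path/to/source /afs/<cell>/<some-dir> 16`, where both paths are placeholders; the source tree should live outside the destination directory.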

The following are the arguments of `fileserver`:

~~~~
-syslog -sync always -p 4 -b 524288 -l 524288 -s 1048576 -vc 4096 -cb 1048576 -vhandle-max-cachesize 32768 -jumbo -udpsize 67108864 -sendsize 67108864 -rxmaxmtu 9000 -rxpck 4096 -busyat 65536
~~~~

(Yesterday over wireless I didn't use Jumbo frames, but the day before, where the same thing happened, I was using them.)

Ciprian.

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
