On Wed, 2 Mar 2011 00:04:50 -0600 Andrew Deason <[email protected]> wrote:
> Looking at it a bit more... one thing that seems odd is that we don't
> ever seem to cancel the GrowMTU event. Shouldn't we be doing that in
> FreeCall/ResetCall/EndCall somewhere? It seems like we could have some
> other event go the CheckCall->FreeCall->DestroyConn route while the
> GrowMTUEvent is still pending, and when the GrowMTUEvent fires, it
> follows the same path and frees the conn again. That wouldn't be a
> problem in the pthreaded case because we check the call refs before
> freeing in CheckCall.

I'm inferring Derrick agrees with this, from gerrit 4108 :). Ryan, if
you would like to try this patch:

<http://git.openafs.org/?p=openafs.git;a=commitdiff_plain;h=f82277b98404bc35a28e4d9ae2d084e37b3f9d7c>

(it will apply with some line offsets)

It would be nice to see if that solves the issue.

> I still wouldn't understand how you can reproduce this so easily,
> though, when I am unable. We can probably give you some gdb
> breakpoints and stuff to run, to see what events are triggering for
> the conn. If it comes to that, anyway, and you're willing to try running
> it again under gdb until the problem recurs (but that's apparently not
> a very long time, heh).

This is still curious, though. If you want to, I'd be interested in you
running vlserver or ptserver under gdb without the above patch, first.
Attach soon after it starts up, and run:

    set height 0
    break rxi_DestroyConnection
    commands
    print *conn
    print conn
    bt
    cont
    end
    break rxi_CleanupConnection
    commands
    print *conn
    print conn
    bt
    cont
    end
    break rxi_FreeCall
    commands
    print *call
    print call
    bt
    cont
    end
    cont

And when it crashes, just save the output (along with the 'bt' of the
crash), and put it somewhere we can get to it.

-- 
Andrew Deason
[email protected]
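
For anyone following along, the suspected race in the quoted paragraph can be modeled in a few lines of C. This is only a sketch under stated assumptions: the names below (struct event, event_post, event_cancel, conn_destroy, event_fire) are hypothetical stand-ins and are not the real Rx structures or API. The handler never touches the freed conn; it only reports whether it would have run against a stale pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct conn;

/* One scheduled callback, owned by a (hypothetical) event queue. */
struct event {
    struct conn *conn;  /* object the callback would act on */
    int cancelled;      /* set by event_cancel(); event_fire() then skips it */
};

/* Hypothetical connection holding a reference to its pending timer. */
struct conn {
    struct event *grow_mtu_ev;  /* pending event, or NULL */
};

/* Schedule an event that refers back to the connection. */
static struct event *event_post(struct conn *c)
{
    struct event *ev = calloc(1, sizeof(*ev));
    assert(ev != NULL);
    ev->conn = c;
    return ev;
}

/* Mark an event so that firing it becomes a no-op. */
static void event_cancel(struct event *ev)
{
    if (ev != NULL)
        ev->cancelled = 1;
}

/* Destroy the connection, optionally cancelling its pending event first. */
static void conn_destroy(struct conn *c, int cancel_events)
{
    if (cancel_events) {
        event_cancel(c->grow_mtu_ev);
        c->grow_mtu_ev = NULL;
    }
    free(c);
}

/*
 * Fire the event. Returns 1 if the handler would have run, i.e. would
 * have followed ev->conn. The pointer is deliberately not dereferenced.
 */
static int event_fire(struct event *ev)
{
    int ran = !ev->cancelled;
    free(ev);
    return ran;
}

/* Returns 1 if a handler would run against an already-freed conn. */
int stale_handler_runs(int cancel_events)
{
    struct conn *c = calloc(1, sizeof(*c));
    struct event *ev;

    assert(c != NULL);
    c->grow_mtu_ev = event_post(c);
    ev = c->grow_mtu_ev;            /* the queue's own reference */
    conn_destroy(c, cancel_events); /* some other path frees the conn */
    return event_fire(ev);          /* the timer fires afterwards */
}
```

Calling stale_handler_runs(0) models the case with no cancellation, where the timer still fires after the conn is gone; stale_handler_runs(1) models cancelling the pending event on the free path, which is the general idea being discussed, not necessarily what the linked patch does.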
