Re: [Pvfs2-developers] Re: pointer aliasing and interface->set_info semantics

Phil Carns Wed, 05 Mar 2008 07:59:50 -0800

I don't know if IB or MX need to use the bmi_method_addr_forget_callback() function. That function makes a little more sense in the context of a particular tcp problem:

- each time a client opens a new tcp socket to a server, the server creates a bmi_addr corresponding to that socket (so it can send responses, etc.).
- if that client exits, then reconnects, the server just thinks of that as an entirely new bmi_addr; it doesn't have any way to realize that it is the same client connecting again using a different socket.

bmi_tcp therefore has to garbage collect old bmi_addr's when sockets close, otherwise the number of addresses can grow indefinitely for a long-running tcp pvfs2-server (a problem Sam found a while back).

Ideally, when bmi_tcp figures out that a socket is closed, it would garbage collect immediately and get rid of the addr. However, the server could still have pending operations for that bmi_addr. So... we mark that addr as in an error state and try to hang onto it until the reference count hits zero before garbage collecting. That lets us report a more meaningful error on the server side for pending operations than "addr doesn't exist".

The bmi_method_addr_forget_callback() in this case is a way to poke the upper level bmi code to say "keep an eye on this addr, and when the refcount hits zero clean it up for me".
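To make the deferred cleanup concrete, here is a minimal sketch in C of the idea described above: mark the address as failed when the socket closes, ask the upper layer to forget it, and only release it once the last pending operation has completed. Every name here (sketch_addr, sketch_forget_callback, SKETCH_ERR_PEER_GONE, and so on) is hypothetical and chosen for illustration; this is not the actual bmi.c or bmi_tcp code, and the released flag merely stands in for really freeing the record.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical error code, not a real BMI value. */
#define SKETCH_ERR_PEER_GONE (-1)

/* Hypothetical stand-in for the upper layer's per-address record. */
struct sketch_addr {
    int  ref_count;      /* pending operations that still hold this address */
    bool in_error;       /* the socket behind this address is known closed */
    bool forget_pending; /* method asked for cleanup once ref_count is zero */
    bool released;       /* stands in for actually freeing the record */
};

/* Upper layer: release only if cleanup was requested and nothing uses it. */
static void sketch_try_release(struct sketch_addr *a)
{
    if (a->forget_pending && a->ref_count == 0)
        a->released = true;
}

/* Upper layer: roughly what a "forget" callback could do in this model. */
void sketch_forget_callback(struct sketch_addr *a)
{
    a->forget_pending = true;
    sketch_try_release(a);
}

/* Method layer: the socket closed, so flag the address and request cleanup. */
void sketch_socket_closed(struct sketch_addr *a)
{
    a->in_error = true;
    sketch_forget_callback(a);
}

/* A newly posted operation takes a reference on the address. */
void sketch_ref_take(struct sketch_addr *a)
{
    a->ref_count++;
}

/* Complete one pending operation: report its status, then drop its reference.
 * Operations on a failed address get a specific error instead of finding
 * that the address no longer exists. */
int sketch_complete_op(struct sketch_addr *a)
{
    int status = a->in_error ? SKETCH_ERR_PEER_GONE : 0;

    assert(a->ref_count > 0);
    a->ref_count--;
    sketch_try_release(a);
    return status;
}
```

In this model, an address with two pending operations survives the socket close, both operations complete with SKETCH_ERR_PEER_GONE, and the record is released only when the second one drops its reference.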

I don't know if that description helps any, but that's my interpretation of what it does :) The address management (in general) in bmi has ended up being pretty wacky.

The DROP_ADDR function is how the bmi.c layer explicitly tells a bmi method to get rid of an address (if that action makes any sense for the method in question). So that part needs to really get rid of the address if necessary rather than handing it back to bmi.c with the bmi_method_addr_forget_callback().
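As a rough illustration of that last point, the sketch below shows a drop handler that releases the method's own state directly instead of bouncing the request back to the upper layer. The names (sketch_set_info, SKETCH_DROP_ADDR, sketch_method_addr) are hypothetical and are not the real BMI interface; which layer owns the outer structure is also an assumption here, so the sketch frees only the method-private data and clears the pointer so that a repeated drop is harmless.

```c
#include <stdlib.h>

/* Hypothetical option code, not the real BMI constant. */
enum sketch_option { SKETCH_DROP_ADDR = 1 };

/* Hypothetical method-level address: only the private part is modeled. */
struct sketch_method_addr {
    void *method_data; /* method-private state, e.g. connection details */
};

/* Handle an option from the upper layer.
 * Returns 0 on success and -1 for an option this sketch does not know. */
int sketch_set_info(enum sketch_option opt, struct sketch_method_addr *ma)
{
    switch (opt) {
    case SKETCH_DROP_ADDR:
        if (ma == NULL)
            return 0;
        /* Really get rid of the state here; do not call back upward. */
        free(ma->method_data);
        ma->method_data = NULL; /* a second drop now calls free(NULL), a no-op */
        return 0;
    default:
        return -1;
    }
}
```

Clearing the pointer after freeing it is one simple guard against the kind of double free discussed further down the thread, though it only helps when both drops go through the same structure.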


-Phil

Scott Atchley wrote:

On Mar 4, 2008, at 6:58 PM, Pete Wyckoff wrote:

[EMAIL PROTECTED] wrote on Tue, 04 Mar 2008 17:35 -0600:

It looks like the IB BMI layer is ending up double-freeing the method_addr
structure in the BMI_ib_set_info function, but it only happens when the
metadata server is also a data server.
If you look at the following GDB output, the last two entries have the same
method_addr, and I can't figure out a good way to tell in BMI_set_info if
the method_address has already been freed. It also looks like the id_string
has been mangled or freed somewhere earlier as well.


All your deadref were different values there, so I'm not seeing the
double-free aspect.  But I have no doubt that you're on to something
in here.  Also, at this location, the id_string and method_addr have
already been freed, so we shouldn't count on them having reasonable
values in them.

I've always had a hard time keeping these references straight.  Can
you verify that you're getting to these spots via dealloc_ref_st(),
and maybe a couple steps up from there, for sanity?

Trying to figure out what other devices do in the DROP_ADDR handler.
MX goes and calls bmi_method_addr_forget_callback() in there, but
that doesn't seem right, as it will just wind around through
dealloc_ref_st() again.  It looks like TCP is doing more or less
what IB is doing.


Pete,

I do not see where bmi_ib uses bmi_method_addr_forget_callback() at all. I am looking at the tcp code and I do need to fix how/where I use the above.


Scott
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

