Re: [Pvfs2-developers] Re: the halloween bug fixed

Sam Lang Mon, 08 Oct 2007 12:57:53 -0700


On Oct 8, 2007, at 1:34 PM, Pete Wyckoff wrote:

[EMAIL PROTECTED] wrote on Mon, 08 Oct 2007 11:13 -0500:

The attached patch is the proposed fix for this problem.  When the
tcp method receives a disconnect from a peer, it invokes a callback
(bmi_method_addr_forget_callback) into the bmi control layer to
remove the address reference from the list.  Maybe I should also add
a counter and limit on how bit the list can get, al though that would
involve potentially forcing long-lived connections to reconnect
periodically, and all methods would have to implement BMI_set_info
(DROP_ADDR).

With tcp, new connections are registered, even if they are from the
same host/port on the peer, whereas the other methods seem to only
register new host/port endpoints that haven't been seen before.  So
its not completely clear to me when the other methods need to call
this callback, if at all.  There needs to be a matching
bmi_method_addr_forget_callback for each
bmi_method_addr_reg_callback, but if the method only registers a
single address per client, the list won't keep growing, unless we
ever plan to support millions of clients.

With gm, the address is registered with the control layer, and
managed internally as well (gm_addr_add).  The address is never
removed from the internal list though (gm_addr_del is never called).
Again, only new host/port pairs that haven't been seen are added to
the list, and registered with the control layer.  gm doesn't
implement the BMI_set_info(DROP_ADDR) call, so addresses are not
expunged even if requested explicitly.  My guess is we should
probably add a gm_addr_del for DROP_ADDR?

With ib, it looks like the server receives new connections and
registers them with the control layer, but the connections never get
closed, or the ib layer doesn't handle them?  The only place I can
find where connections are dropped is if an explicit BMI_set_info
(DROP_ADDR), which doesn't get called from the server.

With mx, it looks like there's a limit on the number of connections
from a peer (BMX_PEER_RX_NUM == 20).  As new connections are received
the idle connections are closed?  Should
bmi_method_addr_forget_callback be called from there?


I read your patch.  And read this mail twice, no thrice.  I'm
totally confused now.

Sorry Pete, I'm not trying to be purposefully arcane. I'll try toelaborate based on your questions.


Tell me which of these things are true/false, and perhaps add more
so we Scott and I can understand.

1. Address types

- core PVFS refers to peers using a 64-bit opaque BMI_addr_t


True.


- BMI has a list that maps BMI_addr_to to a method (e.g. mx or tcp)
    and a struct method_addr *

True. I've been referring to this part of the code as the bmicontrol layer (there's references to that name in the code). Itincludes the 'address reference list', which holds the ref_st_pinstances.


- BMI methods do not see BMI_addr_t.  They see struct method_addr *.
    That struct has a void * that is filled with method-specific
    items, like a connection structure.

True, although my patch changes that, because the address referencelist is accessed based on the PVFS_BMI_addr_t. Thebmi_method_addr_reg_callback function returns a PVFS_BMI_addr_t,which the method is meant to store, and when it comes time to callbmi_method_addr_forget_callback, it passes that PVFS_BMI_addr_t itstored.



2. Address creation

- apps ask for peers by name:  tcp://myhost:1223/pvfs

- these names are parsed by the appropriate method to return
    an existing or new struct method_addr *.  Methods are expected
    to keep a list of known peers?  IB does.

I think they all do, but I'm not sure that they have to. Thereference list stores the method_addr* as well, so the particularmethod could just throw everything in there before callingbmi_method_addr_reg_callback. If the method wants to iterate throughits known addresses though, it needs to manage them itself as well.


- servers can also receive new connections, in which case they
    must build a struct method_addr, and register it with core
    BMI, through bmi_method_addr_reg_callback()


True.



3. Address destruction

- currently never happens.  In theory, core BMI can request
    that a method drop an address, perhaps when out of resources.

Within the particular method (tcp being the one I'm looking at), theaddress is destroyed (tcp_forget_addr is called). But the addressreference is never being removed from the reference list. In thecase of tcp, this is a big problem (the bug in question), because tcpcalls bmi_method_addr_reg_callback for each new connection, not justeach new peer that connects.

The patch implements a complementary callback function:bmi_method_addr_forget_callback, to be called by the individualmethods when the connection is closed or the address is removed fromthe method's own management structures. In the case of tcp, aconnection is closed when poll or epoll returns an IN event on asocket (through testcontext) and the following read returns zero(EOF). Its at this point the method now (with my patch) callsbmi_method_addr_forget_callback to remove the address reference fromthe list.


- BMI_addr_t and struct method_addr and method-internal state are
    kept synchronized how?  Through which calls?  Who frees which
    things when?

On the server, PVFS_BMI_addr_t gets added to the reference list bybmi_method_addr_reg_callback. It gets removed from the list (with mypatch) by bmi_method_addr_forget_callback, before the method frees upthe address structures it was managing internally.



I'm obviously wrong and confused about #3.  And maybe 2b makes
a bad assumption.


What you've described sounds right.


Then how does your new forget callback fit in with all this?
Somehow I thought the endless list growth was purely a TCP problem,
not something that needed core plumbing.

I think its only a tcp problem because the tcp method callsbmi_method_addr_reg_callback for each new connection. It looks likethe other methods only call it for each new peer? The list justdoesn't get larger over time, so unless we plan to allow for millionsof clients, its not an issue for other methods. And in that case,they don't have to worry about calling the forget_callback. I wasassuming that they should in the interest of completeness.


-sam


                -- Pete


_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

Re: [Pvfs2-developers] Re: the halloween bug fixed

Reply via email to