Re: [Pvfs2-developers] Re: the halloween bug fixed

Sam Lang Tue, 09 Oct 2007 06:40:58 -0700


On Oct 9, 2007, at 8:27 AM, Pete Wyckoff wrote:

[EMAIL PROTECTED] wrote on Mon, 08 Oct 2007 14:56 -0500:

True, although my patch changes that, because the address reference
list is accessed based on the PVFS_BMI_addr_t.  The
bmi_method_addr_reg_callback function returns a PVFS_BMI_addr_t,
which the method is meant to store, and when it comes time to call
bmi_method_addr_forget_callback, it passes that PVFS_BMI_addr_t it
stored.

[..]

Within the particular method (tcp being the one I'm looking at), the
address is destroyed (tcp_forget_addr is called).  But the address
reference is never being removed from the reference list.  In the
case of tcp, this is a big problem (the bug in question), because tcp
calls bmi_method_addr_reg_callback for each new connection, not just
each new peer that connects.
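
To make the leak concrete, here is a minimal sketch of a reference list
with a register, lookup, and forget path. It is not the actual BMI code:
every name in it (sketch_addr_t, sketch_ref, sketch_addr_reg, and so on)
is hypothetical, and a plain counter stands in for however the real
handle is generated. It only illustrates why registering once per
connection without a matching forget makes the list grow without bound.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the address handle; not the real BMI type. */
typedef int64_t sketch_addr_t;

/* One entry in the (hypothetical) reference list. */
struct sketch_ref {
    sketch_addr_t addr;       /* handle returned to the method */
    void *method_addr;        /* the method's own address object */
    struct sketch_ref *next;
};

static struct sketch_ref *ref_list = NULL;
static sketch_addr_t next_addr = 1;   /* assumption: simple counter */

/* Register a method address; the caller stores the returned handle.
 * Returns 0 if allocation fails. */
sketch_addr_t sketch_addr_reg(void *method_addr)
{
    struct sketch_ref *r = malloc(sizeof(*r));
    if (r == NULL)
        return 0;
    r->addr = next_addr++;
    r->method_addr = method_addr;
    r->next = ref_list;
    ref_list = r;
    return r->addr;
}

/* Look up a reference by handle; returns NULL if it is not present. */
struct sketch_ref *sketch_ref_lookup(sketch_addr_t addr)
{
    struct sketch_ref *r;
    for (r = ref_list; r != NULL; r = r->next)
        if (r->addr == addr)
            return r;
    return NULL;
}

/* Forget: unlink and free the reference for this handle.
 * Returns 0 on success, -1 if the handle is unknown. */
int sketch_addr_forget(sketch_addr_t addr)
{
    struct sketch_ref **pp = &ref_list;
    while (*pp != NULL) {
        if ((*pp)->addr == addr) {
            struct sketch_ref *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 0;
        }
        pp = &(*pp)->next;
    }
    return -1;
}

/* Count live references, to observe growth of the list. */
size_t sketch_ref_count(void)
{
    size_t n = 0;
    struct sketch_ref *r;
    for (r = ref_list; r != NULL; r = r->next)
        n++;
    return n;
}
```

In this sketch, a method that calls sketch_addr_reg for every new
connection and destroys its own address object without ever calling
sketch_addr_forget leaves one stale entry behind per connection.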


Thanks for all the explanation.  The crucial difference with TCP
that I wasn't grokking was that it doesn't search its own internal
peer list---it always registers each new connection.

Thus your "forget" approach seems good.  Except for one aspect.  Why
force the method to store the PVFS_BMI_addr_t just so it can hand it
back to BMI core, which then converts it into a struct method_addr?
Can you just pass the struct method_addr directly?  If not, no big
deal.

It's not converting it to method_addr.  I'm doing a lookup into the
reference list based on the PVFS_BMI_addr_t, and getting back a
ref_st_p.


More generally, it bugs me that both core BMI and each method must
keep separate lists of addresses.  It's probably time to expose the
data structure to BMI methods so we have just one list.  But this is
certainly more than you set out to do.

Yeah, but I agree it's a mess.  As we head down the path of multiple
methods enabled though, it seems like we will want to allow an
individual method to get at its own peer/connected addresses easily,
without having to iterate through a list where another method has a
bunch of addresses already.

One alternative might be to throw out the address management (this
reference list) in the bmi control layer, in favor of forcing methods
to manage their own (since most of them do anyway).  Instead of
creating PVFS_BMI_addr_t values from id_gen_fast_register (a hash of
the reference pointer), we could come up with a scheme that splits
the 64-bit value into a method type and an address value that the
method returns.  I think that would allow us to keep with the
interface layering that we have now, although it would require some
address management in the tcp method (and possibly others).
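
A rough sketch of what such a split could look like, purely as an
illustration: the 8/56 bit division, the use of an unsigned 64-bit
type, and all of the names (sketch_addr_pack, sketch_addr_method,
sketch_addr_value) are assumptions made up for this example, not
anything that exists in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: top 8 bits hold the method type, low 56 bits hold
 * the value chosen by the method.  The split is arbitrary here. */
#define SKETCH_METHOD_BITS 8
#define SKETCH_VALUE_BITS  (64 - SKETCH_METHOD_BITS)
#define SKETCH_VALUE_MASK  ((UINT64_C(1) << SKETCH_VALUE_BITS) - 1)

/* Combine a method type and a method-supplied value into one 64-bit
 * address.  Bits of method_value above the low 56 are discarded. */
uint64_t sketch_addr_pack(uint8_t method_type, uint64_t method_value)
{
    return ((uint64_t)method_type << SKETCH_VALUE_BITS)
         | (method_value & SKETCH_VALUE_MASK);
}

/* Recover the method type, so the control layer can dispatch to the
 * right method without keeping its own list. */
uint8_t sketch_addr_method(uint64_t addr)
{
    return (uint8_t)(addr >> SKETCH_VALUE_BITS);
}

/* Recover the method-local value, which only the method interprets. */
uint64_t sketch_addr_value(uint64_t addr)
{
    return addr & SKETCH_VALUE_MASK;
}
```

With something like this, the control layer would only need
sketch_addr_method to pick the method, and each method would map the
remaining value to its own peer or connection however it likes.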


-sam


                -- Pete


