On Oct 9, 2007, at 8:27 AM, Pete Wyckoff wrote:
[EMAIL PROTECTED] wrote on Mon, 08 Oct 2007 14:56 -0500:
True, although my patch changes that, because the address reference
list is accessed based on the PVFS_BMI_addr_t. The
bmi_method_addr_reg_callback function returns a PVFS_BMI_addr_t,
which the method is meant to store, and when it comes time to call
bmi_method_addr_forget_callback, it passes that PVFS_BMI_addr_t it
stored.
[..]
Within the particular method (tcp being the one I'm looking at), the
address is destroyed (tcp_forget_addr is called). But the address
reference is never being removed from the reference list. In the
case of tcp, this is a big problem (the bug in question), because tcp
calls bmi_method_addr_reg_callback for each new connection, not just
each new peer that connects.
Thanks for all the explanation. The crucial difference with TCP
that I wasn't grokking was that it doesn't search its own internal
peer list---it always registers each new connection.
Thus your "forget" approach seems good. Except for one aspect. Why
force the method to store the PVFS_BMI_addr_t just so it can hand it
back to BMI core, which then converts it into a struct method_addr?
Can you just pass the struct method_addr directly? If not, no big
deal.
It's not converting it to method_addr. I'm doing a lookup into the
reference list based on the PVFS_BMI_addr_t, and getting back a
ref_st_p.
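
For illustration, the register / lookup / forget flow described above might
be sketched as follows. This is a minimal sketch, not the actual BMI code:
every name prefixed with sketch_ is hypothetical, the id is a simple counter
rather than the value id_gen_fast_register would produce, and the list is a
plain singly linked list standing in for the real reference list.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t sketch_addr_t;      /* hypothetical stand-in for PVFS_BMI_addr_t */

struct sketch_ref {                  /* hypothetical stand-in for a reference entry */
    sketch_addr_t id;                /* handle given back to the method */
    void *method_addr;               /* opaque pointer to the method's address */
    struct sketch_ref *next;
};

static struct sketch_ref *sketch_ref_list = NULL;
static sketch_addr_t sketch_next_id = 1;   /* 0 is reserved for "failure" */

/* Register a method address; returns the handle the method should store. */
sketch_addr_t sketch_addr_reg(void *method_addr)
{
    assert(method_addr != NULL);
    struct sketch_ref *ref = malloc(sizeof(*ref));
    if (ref == NULL)
        return 0;
    ref->id = sketch_next_id++;
    ref->method_addr = method_addr;
    ref->next = sketch_ref_list;
    sketch_ref_list = ref;
    return ref->id;
}

/* Look up the reference entry for a handle; NULL if it is unknown. */
struct sketch_ref *sketch_ref_lookup(sketch_addr_t id)
{
    for (struct sketch_ref *r = sketch_ref_list; r != NULL; r = r->next) {
        if (r->id == id)
            return r;
    }
    return NULL;
}

/* Forget a handle: unlink and free its entry. Returns 0, or -1 if not found. */
int sketch_addr_forget(sketch_addr_t id)
{
    for (struct sketch_ref **pp = &sketch_ref_list; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            struct sketch_ref *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 0;
        }
    }
    return -1;
}

/* Count live entries; useful for spotting references that were never removed. */
size_t sketch_ref_count(void)
{
    size_t n = 0;
    for (struct sketch_ref *r = sketch_ref_list; r != NULL; r = r->next)
        n++;
    return n;
}
```

In this sketch, a method that registers once per connection but never calls
the forget function would make sketch_ref_count() grow without bound, which
is the shape of the leak discussed in this thread.
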
More generally, it bugs me that both core BMI and each method must
keep separate lists of addresses. It's probably time to expose the
data structure to BMI methods so we have just one list. But this is
certainly more than you set out to do.
Yeah, but I agree it's a mess. As we head down the path of multiple
methods enabled though, it seems like we will want to allow an
individual method to get at its own peer/connected addresses easily,
without having to iterate through a list where another method has a
bunch of addresses already.
One alternative might be to throw out the address management (this
reference list) in the bmi control layer, in favor of forcing methods
to manage their own (since most of them do anyway), and instead of
creating PVFS_BMI_addr_t values from id_gen_fast_register (a hash of
the reference pointer), we could come up with a scheme that splits
the 64-bit value into a method type and an address value that the
method returns. I think that would allow us to keep with the
interface layering that we have now, although it would require some
address management in the tcp method (and possibly others).
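
One way the split described above could look, purely as a sketch: the 8/56
bit division, the sketch_ names, and the use of an unsigned 64-bit type are
all assumptions made for illustration, not anything decided in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: top 8 bits hold the method type,
 * low 56 bits hold the address value returned by the method. */
#define SKETCH_METHOD_BITS 8
#define SKETCH_VALUE_BITS  (64 - SKETCH_METHOD_BITS)
#define SKETCH_VALUE_MASK  ((UINT64_C(1) << SKETCH_VALUE_BITS) - 1)

typedef uint64_t sketch_addr_t;   /* hypothetical stand-in for PVFS_BMI_addr_t */

/* Pack a method type and a method-local value into one 64-bit handle. */
static sketch_addr_t sketch_addr_encode(uint8_t method_type, uint64_t method_value)
{
    assert(method_value <= SKETCH_VALUE_MASK);   /* value must fit in 56 bits */
    return ((uint64_t)method_type << SKETCH_VALUE_BITS) | method_value;
}

/* Recover the method type, so the control layer can dispatch to that method. */
static uint8_t sketch_addr_method(sketch_addr_t addr)
{
    return (uint8_t)(addr >> SKETCH_VALUE_BITS);
}

/* Recover the method-local value, which only the owning method interprets. */
static uint64_t sketch_addr_value(sketch_addr_t addr)
{
    return addr & SKETCH_VALUE_MASK;
}
```

With a layout like this, the control layer would only need the method type
to route a call, and each method could map the remaining bits onto its own
peer table without walking addresses that belong to other methods.
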
-sam
-- Pete