On Sep 21, 2006, at 11:23 AM, Pete Wyckoff wrote:
[EMAIL PROTECTED] wrote on Thu, 21 Sep 2006 10:19 -0400:
If I use a URI of mx://hostname:board:endpoint for each peer, I am
wondering how the local bmi_mx gets its own name (in order to know
which NIC to use and which MX endpoint).
Looking through bmi.c, activate_method() calls
BMI_meth_method_addr_lookup(listen_addr) before calling
BMI_meth_initialize(). The function activate_method() is called from
BMI_initialize() which indicates that listen_addr is "a comma
separated list of addresses to listen on for each method." Should I
assume that the following will happen to startup bmi_mx:
new_addr = BMI_meth_method_addr_lookup("mx://myhostname:myboard:myport");
...
BMI_meth_initialize(new_addr, ...)
Is this a correct assumption for the method_addr_lookup() and order
of operations?
Yes on the assumption, thus BMI_mx_method_addr_lookup() should not
look at any state that needs to be initialized in
BMI_meth_initialize. It just munges strings. The IB implementation
relies on a static variable to check if it has been initialized or
not.
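To make "it just munges strings" concrete, here is a minimal sketch of a parser for the mx://hostname:board:endpoint form discussed above. It touches no global state, so it is safe to call before any initialize step. The names sketch_peer_name and sketch_parse_mx_uri are hypothetical and are not part of BMI or MX; the real lookup function would also have to build whatever address structure BMI expects.

```c
#include <stdio.h>

/* Hypothetical container for a parsed peer name; not a BMI or MX type. */
struct sketch_peer_name {
    char hostname[256];
    unsigned int board;
    unsigned int endpoint;
};

/*
 * Parse "mx://hostname:board:endpoint" into *out.
 * Returns 0 on success, -1 if the string is malformed.
 * Uses only its arguments: no lists, locks, or other method state.
 */
static int sketch_parse_mx_uri(const char *uri, struct sketch_peer_name *out)
{
    char trailing;
    int fields;

    if (uri == NULL || out == NULL)
        return -1;

    /* %255[^:] reads the hostname up to the first ':' (at most 255 chars);
     * the final %c only matches if junk follows the endpoint number. */
    fields = sscanf(uri, "mx://%255[^:]:%u:%u%c",
                    out->hostname, &out->board, &out->endpoint, &trailing);
    if (fields != 3)
        return -1;

    return 0;
}
```

With this sketch, "mx://node01:0:3" parses to hostname node01, board 0, endpoint 3, while a string missing the endpoint or using another scheme is rejected.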
I'm not thoroughly happy with this situation. If it is a major
problem for MX, speak up. We can fix the API.
I do not foresee a problem yet, but if one comes up, I'll let you
know. :-)
If so, then my method_addr_lookup() function has to check if bmi_mx
has been initialized before using any lists, locks, etc., no?
In BMI_mx_initialize, do whatever basic NIC-independent setup you
need to do. If called with a non-NULL listen_addr, act as a server
and listen on the device passed in. To do that you'll have to
open the particular board in the address. Ideally you'd have a way
of listening on all boards since you don't know from where your
clients will come yet.
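The initialize behavior described above could be sketched roughly as follows. This is only an illustration under assumptions: the sketch_* types and function are hypothetical stand-ins, the real BMI_meth_initialize signature is not reproduced here, and the point where a real method would open the MX endpoint is marked with a comment rather than an actual MX call.

```c
#include <stddef.h>

/* Hypothetical stand-in for the address produced by the lookup step. */
struct sketch_listen_addr {
    unsigned int board;
    unsigned int endpoint;
};

enum sketch_role { SKETCH_ROLE_CLIENT, SKETCH_ROLE_SERVER };

/* Hypothetical per-process method state (no globals). */
struct sketch_method_state {
    int initialized;
    enum sketch_role role;
    int endpoint_open;
    unsigned int board;
    unsigned int endpoint;
};

/*
 * Initialize the method. A non-NULL listen_addr means "act as a server
 * and listen on the board named in the address"; NULL means "client,
 * open nothing yet". Returns 0 on success, -1 on error.
 */
static int sketch_method_init(struct sketch_method_state *st,
                              const struct sketch_listen_addr *listen_addr)
{
    if (st == NULL || st->initialized)
        return -1;

    /* NIC-independent setup (lists, locks, ...) would go here. */
    st->initialized = 1;
    st->endpoint_open = 0;

    if (listen_addr != NULL) {
        st->role = SKETCH_ROLE_SERVER;
        st->board = listen_addr->board;
        st->endpoint = listen_addr->endpoint;
        /* A real method would open the MX endpoint on this board here. */
        st->endpoint_open = 1;
    } else {
        st->role = SKETCH_ROLE_CLIENT;
    }
    return 0;
}
```

Keeping all state in a structure passed by the caller, rather than in globals, also leaves the door open to more than one instance per process.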
And the non-NULL will be a mx://... string, no?
As for listening on all boards, that is not supported by MX directly
(an MX endpoint is specific to a NIC). If this is a requirement of
PVFS (to be able to use multiple Myricom NICs), then we have a couple of
options. First, can PVFS open two BMI methods of the same type? If
so, then we just have to ensure that all bmi_mx state is local to
each process (no globals, etc.). If BMI cannot open two of the same
type, then I will have to manage two (or more) MX endpoints in bmi_mx.
That then raises the issue of whether the NICs are in the same fabric
or in disjoint fabrics. If the same, then I can send (and receive)
using either one and we will need to consider some form of striping
over the NICs to ensure that we maximize utilization of both. If they
are disjoint, I will need to maintain separate peer lists for each
and determine which NIC needs to send and receive for a given peer.
Using more than one board definitely complicates matters. ;-)
Initially, I will support one board to get things going.
On a client, you don't know at initialize time to what server(s)
you'll need to connect. Somehow you'll have to prepare the device
as needed in the first sendunexpected call.
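One way to read "prepare the device as needed in the first sendunexpected call" is a lazy open guarded by a flag, sketched below. All names are hypothetical, the actual MX open and transmit steps are left as comments, and how the client picks the board is deliberately left to the caller since that question is still open in this thread.

```c
#include <stddef.h>

/* Hypothetical client-side state; not a BMI or MX type. */
struct sketch_client {
    int endpoint_open;     /* nonzero once the device has been prepared */
    unsigned int board;    /* board chosen at first use */
    unsigned int opens;    /* number of times the device was opened */
};

/* Open the device on first use only. Returns 0 on success, -1 on error. */
static int sketch_ensure_open(struct sketch_client *c, unsigned int board)
{
    if (c == NULL)
        return -1;
    if (c->endpoint_open)
        return 0;          /* already prepared by an earlier send */

    /* A real method would open the MX endpoint on this board here. */
    c->board = board;
    c->endpoint_open = 1;
    c->opens++;
    return 0;
}

/* Hypothetical send path: prepare the device lazily, then send. */
static int sketch_send_unexpected(struct sketch_client *c, unsigned int board,
                                  const void *buf, size_t len)
{
    if (buf == NULL && len != 0)
        return -1;
    if (sketch_ensure_open(c, board) != 0)
        return -1;

    /* Actual transmission omitted in this sketch. */
    return 0;
}
```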
Does this work?
-- Pete
I am not worried about peer (i.e. server) state at initialize time.
Sending to others is not an issue as long as the URI is passed as
part of the send context. Does the client also get a listen_addr
string or not? I would assume not. If so, I am then free to open any
endpoint I want and then I would pass that info to the server before
sending my first sendunexpected message (i.e. in my method's connect
request message).
If the client does not get a listen_addr URI and if the machine has
multiple NICs, I will not know which one to open. I could have a
#define that compiles in the board number but it would require that
all machines use the same board. Any suggestions?
Scott