Scott,
I compiled and ran your test program (with a few small errors
corrected):
#include <stdlib.h>
#include <stdio.h>
#include <inttypes.h>
#include "myriexpress.h"

int main(int argc, char *argv[]) {
    uint32_t ep_id, sid, result = 0;
    uint64_t nic_id;
    mx_return_t ret;
    mx_endpoint_t ep;
    mx_endpoint_addr_t epa;
    mx_request_t request;
    mx_status_t status;

    mx_init();
    /* use the first NIC, open endpoint 0, filter = 0, no params */
    ret = mx_open_endpoint(0, 0, 0, NULL, 0, &ep);
    if (ret) {
        printf("open_endpoint() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    ret = mx_get_endpoint_addr(ep, &epa);
    if (ret) {
        printf("get_endpoint_addr() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    ret = mx_decompose_endpoint_addr2(epa, &nic_id, &ep_id, &sid);
    if (ret) {
        printf("mx_decompose_endpoint_addr2() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    ret = mx_iconnect(ep, nic_id, 0, 0, 0, NULL, &request);
    if (ret) {
        printf("iconnect() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    do {
        ret = mx_test(ep, &request, &status, &result);
        if (result)
            printf("iconnect completed with status %s\n", mx_strstatus(status.code));
    } while (!result);
    mx_finalize();
    return 0;
}
Running it is successful:
jrandall@begbie:/tmp$ ./omxtestself
iconnect completed with status Success
However, I'm not really sure what this is testing -- it seems to open only one endpoint, look up that endpoint's address, and then try to connect to it. I thought the failure I was observing occurred when the client (on one endpoint) tried to connect to the server (on another endpoint?) on the same host. Am I mistaken about that? In any case, I've tried to extend your test to cover that scenario:
#include <stdlib.h>
#include <stdio.h>
#include <inttypes.h>
#include "myriexpress.h"

int main(int argc, char *argv[]) {
    uint32_t ep_id, sid, result = 0;
    uint64_t nic_id;
    uint64_t nic_id_from_hostname;
    mx_return_t ret;
    mx_endpoint_t ep;
    mx_endpoint_addr_t epa;
    mx_request_t request;
    mx_status_t status;
    mx_endpoint_t ep2;

    mx_init();
    /* use the first NIC, open endpoint 0, filter = 0, no params */
    ret = mx_open_endpoint(0, 0, 0, NULL, 0, &ep);
    if (ret) {
        printf("open_endpoint() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    ret = mx_get_endpoint_addr(ep, &epa);
    if (ret) {
        printf("get_endpoint_addr() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    ret = mx_decompose_endpoint_addr2(epa, &nic_id, &ep_id, &sid);
    if (ret) {
        printf("mx_decompose_endpoint_addr2() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    ret = mx_iconnect(ep, nic_id, 0, 0, 0, NULL, &request);
    if (ret) {
        printf("iconnect() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    do {
        ret = mx_test(ep, &request, &status, &result);
        if (result)
            printf("iconnect completed with status %s\n", mx_strstatus(status.code));
    } while (!result);

    /* now test connecting to self by hostname */
    ret = mx_hostname_to_nic_id("begbie:0", &nic_id_from_hostname);
    if (ret) {
        printf("hostname_to_nic_id() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    ret = mx_iconnect(ep, nic_id_from_hostname, 0, 0, 0, NULL, &request);
    if (ret) {
        printf("iconnect() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    do {
        ret = mx_test(ep, &request, &status, &result);
        if (result)
            printf("iconnect to nic_id from hostname completed with status %s\n",
                   mx_strstatus(status.code));
    } while (!result);

    /* finally, try opening a second endpoint and connect to the first
       from the new one: first NIC, endpoint 1, filter = 0, no params */
    ret = mx_open_endpoint(0, 1, 0, NULL, 0, &ep2);
    if (ret) {
        printf("open_endpoint() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    /* connect to first endpoint with second */
    ret = mx_iconnect(ep2, nic_id_from_hostname, 0, 0, 0, NULL, &request);
    if (ret) {
        printf("iconnect() returned %s\n", mx_strerror(ret));
        exit(1);
    }
    do {
        ret = mx_test(ep2, &request, &status, &result);
        if (result)
            printf("iconnect between endpoints completed with status %s\n",
                   mx_strstatus(status.code));
    } while (!result);

    mx_finalize();
    return 0;
}
This results in:
jrandall@begbie:/tmp$ OMX_CONNECT_POLLALL=1 ./omxtestself2
OMX: Forcing connect polling all endpoints to enabled
iconnect completed with status Success
iconnect to nic_id from hostname completed with status Success
(it does not exit, but seems to loop forever in the final mx_test loop)
I'm not sure that I'm doing this right. I have also implemented it as two separate test programs, one for each endpoint, and that version has exactly the same issue. I'll pass this test case along to the Open-MX people and see if they can point out what is wrong.
Thanks!
Josh.
On 23 May 2011, at 16:51, Atchley, Scott wrote:
On May 19, 2011, at 5:00 PM, Joshua Randall wrote:
Phil,
Yes, I have done these tests with MM_IMM_ACK="1" and
OMX_FATAL_ERRORS="0" set for both server and client.
Since my last message I have tried setting (in mx.h):
#define BMX_DB_MASK (BMX_DB_ALL)
and running pvfs2-ping with PVFS2_DEBUGMASK="all" both on the remote client and on the client running on the same host as the server. Comparing the two outputs, I see the first lines that differ are:
client-server same host:
[D 20:35:21.445292] bmi_mx: entering bmx_connection_handlers.
[D 20:35:21.445308] bmi_mx: exiting bmx_connection_handlers.
client-server different hosts:
[D 20:42:13.754440] bmi_mx: entering bmx_connection_handlers.
[D 20:42:13.754440] bmi_mx: entering bmx_connection_handlers.
[D 20:42:13.754463] bmi_mx: bmx_handle_icon_req returned for mx://renton:0:3 with Success.
[D 20:42:13.754480] bmi_mx: bmx_handle_icon_req tx match= 0xc000000100000100 length= 0.
[D 20:42:13.754505] bmi_mx: bmx_handle_conn_req returned TX match 0xc000000100000100 with Success.
[D 20:42:13.754515] bmi_mx: CONN_REQ sent to mx://renton:0:3.
[D 20:42:13.754522] bmi_mx: entering bmx_peer_decref.
[D 20:42:13.754530] bmi_mx: exiting bmx_peer_decref.
[D 20:42:13.754537] bmi_mx: entering bmx_ctx_init.
[D 20:42:13.754545] bmi_mx: exiting bmx_ctx_init.
[D 20:42:13.754552] bmi_mx: exiting bmx_connection_handlers.
The process calls mx_iconnect() to the peer. When the connect completes, it returns the peer's MX endpoint address. The process then packs a CONN_REQUEST message and sends it to the peer using mx_isend() and the new MX endpoint address.
In the first case, the mx_iconnect() to itself does not seem to complete. It may be an issue in Open-MX. A simple test would be to compile and run the following on Open-MX:
#include <stdio.h>
#include <inttypes.h>
#include "myriexpress.h"

int main(int argc, char *argv[])
{
    uint32_t ep_id, sid, result = 0;
    uint64_t nic_id;
    mx_return_t ret;
    mx_endpoint_t ep;
    mx_endpoint_addr_t epa;
    mx_request_t request;

    mx_init();
    /* use the first NIC, open endpoint 0, filter = 0, no params */
    ret = mx_open_endpoint(0, 0, 0, NULL, 0, &ep);
    if (ret) {
        printf("open_endpoint() returned %s\n", mx_strerr(ret));
        exit(1);
    }
    ret = mx_get_endpoint_addr(ep, &epa);
    if (ret) {
        printf("get_endpoint_addr() returned %s\n", mx_strerr(ret));
        exit(1);
    }
    ret = mx_decompose_endpoint_addr2(epa, &nic_id, &ep_id, &sid);
    if (ret) {
        printf("mx_decompose_endpoint_addr2() returned %s\n", mx_strerr(ret));
        exit(1);
    }
    ret = mx_iconnect(ep, nic_id, 0, 0, 0, NULL, &request);
    if (ret) {
        printf("iconnect() returned %s\n", mx_strerr(ret));
        exit(1);
    }
    do {
        ret = mx_test(ep, &request, &status, &result);
        if (result)
            printf("iconnect completed with status is %s\n", mx_strstatus(status.code));
    } while (!result);
    mx_fini();
    return 0;
}
then, later on, with the client-server on different hosts, it gets:
[D 20:42:13.780057] bmi_mx: CONN_ACK from mx://renton:0:3 id= 3.
[D 20:42:13.780070] bmi_mx: setting mx://renton:0:3's state to READY.
...
With the client-server on the same host, there are no CONN_REQ or
CONN_ACK messages.
I've run pvfs2-ping in gdb with a breakpoint on bmx_connection_handlers, and I see that when it goes into bmx_handle_icon_req() on the same host as the server, the call to "mx_test_any(bmi_mx->bmx_ep, match, mask, &status, &result);" simply returns 0 in result:
client-server same host:
2560            bmx_handle_icon_req();
(gdb) step
bmx_handle_icon_req () at src/io/bmi/bmi_mx/mx.c:2218
2218            uint32_t result = 0;
(gdb) step
2221            uint64_t match = (uint64_t) BMX_MSG_ICON_REQ << BMX_MSG_SHIFT;
(gdb) step
2222            uint64_t mask = BMX_MASK_MSG;
(gdb) step
2225            mx_test_any(bmi_mx->bmx_ep, match, mask, &status, &result);
(gdb) step
2226            if (result) {
(gdb) print result
$1 = 0
(gdb) print status
$2 = {code = 540697956, source = {stuff = {4210425200352911656, 6072343580357116704}}, match_info = 4833952, msg_length = 7084832, xfer_length = 0, context = 0x6c1b60}
client-server different hosts:
2560            bmx_handle_icon_req();
(gdb) step
bmx_handle_icon_req () at src/io/bmi/bmi_mx/mx.c:2218
2218            uint32_t result = 0;
(gdb) step
2234            debug(BMX_DB_CONN, "%s returned for %s with %s", __func__,
(gdb) step
2225            mx_test_any(bmi_mx->bmx_ep, match, mask, &status, &result);
(gdb) step
2226            if (result) {
(gdb) print result
$2 = 1
(gdb) print status
$3 = {code = MX_STATUS_SUCCESS, source = {stuff = {8235576, 6207088827828273152}}, match_info = 12682136550675316736, msg_length = 0, xfer_length = 0, context = 0x6bc910}
I traced this into the Open-MX library, using the debug library, and stepped through omx__test_any_common(ep, match_info, match_mask, status) on the client in each of the two configurations (same host as the server or a different one).
They do basically the same thing until libopen-mx/omx_test.c:276, where there is a test that the request's match_info matches the requested match_info:
if (likely((req->generic.status.match_info & match_mask) == match_info)) {
with client-server on the same host, it is false:
(gdb) print match_info
$6 = 12682136550675316736
(gdb) print (req->generic.status.match_info & match_mask)
$7 = 0
(gdb) print req->generic.status.match_info
$8 = 8395144
(gdb) print ((req->generic.status.match_info & match_mask) == match_info)
$9 = 0
however, with client-server on different hosts, it is true:
(gdb) print match_info
$5 = 12682136550675316736
(gdb) print (req->generic.status.match_info & match_mask)
$6 = 12682136550675316736
(gdb) print req->generic.status.match_info
$7 = 12682136550675316736
(gdb) print ((req->generic.status.match_info & match_mask) == match_info)
$8 = 1
I don't know either bmi_mx or open-mx well enough to have much of an
idea of what is going on here.
Josh.
Thanks, Josh.
bmi_mx does not handle self-communication any differently. It tries
to connect to itself using the normal code path for connecting to
others.
My guess from the above is that Open-MX is not setting/saving the match_info for self-communications. It may not be setting other fields as well, but Brice's team will need to look into that.
Scott
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users