Kind of sounds to me like they are using the wrong proc when receiving. Here is an example of what a modex receive should look like:

https://github.com/open-mpi/ompi/blob/main/opal/mca/btl/ugni/btl_ugni_endpoint.c#L44

-Nathan

On Aug 3, 2022, at 11:29 AM, "Jeff Squyres (jsquyres) via devel" <devel@lists.open-mpi.org> wrote:

Glad you solved the first issue!

With respect to debugging, if you don't have a parallel debugger, you can do something like this: https://www.open-mpi.org/faq/?category=debugging#serial-debuggers

If you haven't done so already, I highly suggest configuring Open MPI with "CFLAGS=-g -O0".

As for the modex, it does actually use TCP under the covers, but that shouldn't matter to you: the main point is that the BTL is not used for exchanging modex information.  Hence, whatever your BTL module puts into the modex and gets out of the modex should happen asynchronously without involving the BTL.

--
Jeff Squyres
jsquyres@cisco.com

________________________________________
From: devel <devel-boun...@lists.open-mpi.org> on behalf of Michele Martinelli via devel <devel@lists.open-mpi.org>
Sent: Wednesday, August 3, 2022 12:49 PM
To: de...@lists.open-mpi.org
Cc: Michele Martinelli
Subject: Re: [OMPI devel] How to progress MPI_Recv using custom BTL for NIC under development

Thank you for the answer. Actually, I think I solved that problem some days ago. Basically (if I understand correctly), MPI "adds", in some sense, a header to the data sent (please correct me if I'm wrong), which is then used by ob1 to match the arriving data with the MPI_Recv posted by the user. The problem was a poorly reconstructed header on the receiving side.

Unfortunately, my happiness didn't last long, because I have already found another problem: it seems that the peers are not actually exchanging the correct information via the modex protocol (I'm not sure which kind of network connection they are using in that phase). They receive "local" data instead of the remote data. I have only just started debugging this, so maybe I could open a new thread specifically about it.

Michele

On 03/08/22 15:43, Jeff Squyres (jsquyres) wrote:

Sorry for the huge delay in replies -- it's summer / vacation season, and I think we (as a community) are a little behind in answering some of these emails.  :-(

It's been quite a while since I have been in the depths of BTL internals; I'm afraid I don't remember the details offhand.

When I was writing the usnic BTL, I know I found it useful to attach a debugger on the sending and/or receiving side processes, and actually step through both my BTL code and the OB1 PML code to see what was happening.  I frequently found that either my BTL wasn't correctly accounting for network conditions, or it wasn't passing up to OB1 the information that OB1 expected (e.g., it passed the wrong length, or the wrong ID number, or ...something else).  You can actually follow what happens in OB1 when your BTL invokes the cbfunc -- does it find a corresponding MPI_Request, and does it mark it complete?  Or does it treat your incoming fragment as an unexpected message for some reason, and put it on the unexpected queue?  Look for that kind of stuff.

--
Jeff Squyres
jsquyres@cisco.com

________________________________________
From: devel <devel-boun...@lists.open-mpi.org> on behalf of Michele Martinelli via devel <devel@lists.open-mpi.org>
Sent: Saturday, July 23, 2022 9:04 AM
To: de...@lists.open-mpi.org
Cc: Michele Martinelli
Subject: [OMPI devel] How to progress MPI_Recv using custom BTL for NIC under development

Hi,

I'm trying to develop a BTL for a custom NIC. I studied the btl.h file to understand the flow of calls that are expected to be implemented in my component. I'm using a simple test (which works like a charm with the TCP BTL) to test my development. The code is a simple MPI_Send + MPI_Recv:

    MPI_Init(NULL, NULL);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    int ping_pong_count = 1;
    int partner_rank = (world_rank + 1) % 2;
    printf("MY RANK: %d PARTNER: %d\n", world_rank, partner_rank);
    if (world_rank == 0) {
        ping_pong_count++;
        MPI_Send(&ping_pong_count, 1, MPI_INT, partner_rank, 0, MPI_COMM_WORLD);
        printf("%d sent and incremented ping_pong_count %d to %d\n",
               world_rank, ping_pong_count, partner_rank);
    } else {
        MPI_Recv(&ping_pong_count, 1, MPI_INT, partner_rank, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("%d received ping_pong_count %d from %d\n",
               world_rank, ping_pong_count, partner_rank);
    }
    MPI_Finalize();

I see that in my component's BTL code the functions called during the "MPI_Send" phase are:

  1. mca_btl_mycomp_add_procs
  2. mca_btl_mycomp_prepare_src
  3. mca_btl_mycomp_send (where I set the return to 1, so the send phase should be finished)

I then see the print inside the test:

    0 sent and incremented ping_pong_count 2 to 1

and this should conclude the MPI_Send phase.

Then, in the btl_mycomp_component_progress function, I implemented a call to:

    mca_btl_active_message_callback_t *reg = mca_btl_base_active_message_trigger + tag;
    reg->cbfunc(&my_btl->super, &desc);

I saw the same code in all the other BTLs, and I thought this was enough to "unlock" the MPI_Recv "polling". But my test actually hangs, probably "waiting" for something that never happens (?).

I also took a look at the ob1 mca_pml_ob1_recv_frag_callback_match function (which I assume is the reg->cbfunc), and it seems to get to the end of the function, actually matching my frag.

So my question is: how can I tell the framework that I have finished my work, so that the function can return to the user application? What am I doing wrong?

Is there a way to understand where my code is waiting, and what it is waiting for?

Best
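
For readers following the thread: the question Jeff raises (does the incoming fragment complete a posted request, or does it land on the unexpected queue?) can be pictured with a toy model. This is only a sketch, not ob1's code. Every name in it (toy_hdr_t, toy_pml_t, toy_post_recv, toy_incoming) is hypothetical, and it assumes that a match simply compares a communicator context id, a source rank, and a tag; real matching also handles ordering and wildcards, which are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX 8   /* arbitrary capacity for this sketch */

/* Hypothetical match header: the fields a receiver compares. */
typedef struct {
    int ctx;   /* communicator context id */
    int src;   /* rank of the sender */
    int tag;   /* user-supplied tag */
} toy_hdr_t;

/* A posted receive request. */
typedef struct {
    toy_hdr_t want;      /* what the receive was posted for */
    bool      complete;  /* set once a fragment has matched */
    int       value;     /* payload delivered on match */
} toy_recv_t;

/* A fragment that arrived before any matching receive was posted. */
typedef struct {
    toy_hdr_t hdr;
    int       value;
} toy_frag_t;

typedef struct {
    toy_recv_t *posted[TOY_MAX];
    size_t      n_posted;
    toy_frag_t  unexpected[TOY_MAX];
    size_t      n_unexpected;
} toy_pml_t;

static bool toy_hdr_equal(toy_hdr_t a, toy_hdr_t b)
{
    return a.ctx == b.ctx && a.src == b.src && a.tag == b.tag;
}

/* Post a receive: consume a queued unexpected fragment if one matches,
 * otherwise leave the request pending on the posted list. */
static void toy_post_recv(toy_pml_t *pml, toy_recv_t *req)
{
    req->complete = false;
    for (size_t i = 0; i < pml->n_unexpected; i++) {
        if (toy_hdr_equal(pml->unexpected[i].hdr, req->want)) {
            req->value = pml->unexpected[i].value;
            req->complete = true;
            /* Swap-remove; a real implementation must preserve order. */
            pml->unexpected[i] = pml->unexpected[--pml->n_unexpected];
            return;
        }
    }
    assert(pml->n_posted < TOY_MAX);
    pml->posted[pml->n_posted++] = req;
}

/* What the receive callback does in this toy: returns true if the
 * fragment completed a posted request, false if it was queued as
 * unexpected (in which case no request is marked complete). */
static bool toy_incoming(toy_pml_t *pml, toy_hdr_t hdr, int value)
{
    for (size_t i = 0; i < pml->n_posted; i++) {
        toy_recv_t *req = pml->posted[i];
        if (toy_hdr_equal(req->want, hdr)) {
            req->value = value;
            req->complete = true;
            pml->posted[i] = pml->posted[--pml->n_posted];
            return true;
        }
    }
    assert(pml->n_unexpected < TOY_MAX);
    pml->unexpected[pml->n_unexpected].hdr = hdr;
    pml->unexpected[pml->n_unexpected].value = value;
    pml->n_unexpected++;
    return false;
}
```

In this model, a fragment whose header was rebuilt with the wrong source rank or tag never completes the pending request: toy_incoming returns false, the unexpected list grows, and the receiver keeps waiting. That is one way a hang like the one described above could arise, so when stepping through the real callback in a debugger it may be worth checking which of the two paths is taken.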
