Re: [OMPI devel] Problem with openib on demand connection bring up.
Gleb Natapov wrote:
> The patch applies to ib_multifrag as is, without a conflict. But the branch doesn't compile with or without the patch, so I was not able to test it. Do you have some uncommitted changes that may generate a conflict? Can you commit them so they can be resolved? If there is no conflict between your work and this patch, maybe it is a good idea to commit it to your branch and trunk for testing?

I have a whole pile of changes that need to be committed, and even with these changes it still doesn't compile, as I am reworking names, data structures, etc. I will commit what I have now and will work on this a bit more over the weekend.

- Galen
Re: [OMPI devel] Problem with openib on demand connection bring up.
On Jun 14, 2007, at 7:11 AM, Jeff Squyres wrote:
> > Now I see that my fix was in the right place, but still a little bit wrong. I committed a fix to my fix in r15073. Can you check it?
>
> My cluster is still running MTT from last night; I'll need to wait for several jobs to finish. I'll check it later today.

I got a test job to run in the middle of other MTT runs. r15073 seems to have fixed the problem; thanks.

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] Problem with openib on demand connection bring up.
On Jun 14, 2007, at 6:32 AM, Gleb Natapov wrote:
> > 794:mca_btl_openib_endpoint_recv] can't find suitable endpoint for this peer
>
> Now I see that my fix was in the right place, but still a little bit wrong. I committed a fix to my fix in r15073. Can you check it?

My cluster is still running MTT from last night; I'll need to wait for several jobs to finish. I'll check it later today.

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] Problem with openib on demand connection bring up.
On Wed, Jun 13, 2007 at 07:08:51PM +0300, Gleb Natapov wrote:
> I am not committing this yet. I want people to review my logic and the patch. If the change is OK with everyone who cares, then I want this change to go into the 1.2 branch.
>
> I don't care how this change gets to the trunk. I can use a patched version for a while. If your branch is in a working state right now, I can merge this change into it tomorrow.

The patch applies to ib_multifrag as is, without a conflict. But the branch doesn't compile with or without the patch, so I was not able to test it. Do you have some uncommitted changes that may generate a conflict? Can you commit them so they can be resolved? If there is no conflict between your work and this patch, maybe it is a good idea to commit it to your branch and trunk for testing?

--
Gleb.
Re: [OMPI devel] Problem with openib on demand connection bring up.
On Jun 13, 2007, at 12:07 PM, Gleb Natapov wrote:
> On Wed, Jun 13, 2007 at 02:05:00PM -0400, Jeff Squyres wrote:
> > On Jun 13, 2007, at 1:54 PM, Jeff Squyres wrote:
> > > With today's trunk, I still see the problem:
> >
> > Same thing happens on v1.2 branch. I'll re-open #548.
>
> I am sure it was never tested with multiple subnets. I'll try to get such a setup.

I tested this with multiple subnets, but it was quite some time ago.

- Galen
Re: [OMPI devel] Problem with openib on demand connection bring up.
On Jun 13, 2007, at 1:54 PM, Jeff Squyres wrote:
> With today's trunk, I still see the problem:

Same thing happens on v1.2 branch. I'll re-open #548.

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] Problem with openib on demand connection bring up.
On Jun 13, 2007, at 11:33 AM, Jeff Squyres wrote:
> On Jun 13, 2007, at 1:15 PM, Nysal Jan wrote:
> > There is a ticket (closed) here: https://svn.open-mpi.org/trac/ompi/ticket/548
> > It was fixed by Galen for 1.2.
>
> Ah -- I forgot to look at closed tickets. I think we broke it again; it certainly fails on the trunk (perhaps related to what Gleb found?). I did not test 1.2.
>
> > There is a FAQ entry also about this: http://www.open-mpi.org/faq/?category=openfabrics#ofa-port-wireup
>
> That's what it *should* be doing, but I wonder if that's what it *actually* is doing.

It has been a while, but we tested this on our local cluster with differing numbers of ports and it worked. I was only doing simple ping-pongs, though. If both sides try to open a connection at the same time, however, badness can occur, from my understanding of this.

- Galen
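The simultaneous-open case mentioned here is what the master/slave scheme proposed at the start of this thread is meant to handle. The following is only a rough sketch of that idea, not code from the patch. The thread does not say how the two roles are assigned, so comparing numeric process identifiers is an assumption, and every name below (`role_t`, `action_t`, `local_role`, `start_connect`) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical roles in the asymmetric connection protocol. */
typedef enum { ROLE_MASTER, ROLE_SLAVE } role_t;

/* What a process does when it wants a connection to a peer. */
typedef enum {
    ACT_SEND_CONNECT_REQUEST,  /* slave: create a QP and send it to the master */
    ACT_ASK_PEER_TO_CONNECT    /* master: ask the slave to start the handshake */
} action_t;

/* Assumption: both sides derive the roles from ids they already know,
 * so they agree without exchanging any messages. The lower id is the
 * master here; the thread does not specify the actual rule. */
role_t local_role(uint32_t my_id, uint32_t peer_id)
{
    assert(my_id != peer_id);
    return my_id < peer_id ? ROLE_MASTER : ROLE_SLAVE;
}

/* Only the slave ever sends a real connection request, so two processes
 * that start connecting at the same moment cannot both create a QP and
 * race each other. */
action_t start_connect(uint32_t my_id, uint32_t peer_id)
{
    return local_role(my_id, peer_id) == ROLE_SLAVE
        ? ACT_SEND_CONNECT_REQUEST
        : ACT_ASK_PEER_TO_CONNECT;
}
```

With this rule, if processes 3 and 9 both decide to connect at once, process 9 sends the connection request and process 3 only sends a "please connect" message, so the handshake always runs in one direction.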
Re: [OMPI devel] Problem with openib on demand connection bring up.
On Wed, Jun 13, 2007 at 12:45:01PM -0400, Jeff Squyres wrote:
> On Jun 13, 2007, at 12:08 PM, Gleb Natapov wrote:
>
> > I am not committing this yet. I want people to review my logic and the patch. If the change is OK with everyone who cares, then I want this change to go into the 1.2 branch.
> >
> > I don't care how this change gets to the trunk. I can use a patched version for a while. If your branch is in a working state right now, I can merge this change into it tomorrow.
>
> I was just bitten yesterday by a problem that I've known about for a while but had never gotten around to looking into (I could have sworn that there was an open trac ticket on this, but I can't find one anywhere).
>
> I have 2 hosts: one with 3 active ports and one with 2 active ports. If I run an MPI job between them, the openib BTL wireup goes badly and it aborts. So a heterogeneous number of ports is not currently handled properly in the code.

Are they all in the same subnet? If not, I fixed a bug yesterday that may help.

> I don't know if Gleb's patch addresses this situation or not; I'll look at his patch this afternoon.

This patch addresses a different problem.

--
Gleb.
Re: [OMPI devel] Problem with openib on demand connection bring up.
On Jun 13, 2007, at 1:15 PM, Nysal Jan wrote:
> There is a ticket (closed) here: https://svn.open-mpi.org/trac/ompi/ticket/548
> It was fixed by Galen for 1.2.

Ah -- I forgot to look at closed tickets. I think we broke it again; it certainly fails on the trunk (perhaps related to what Gleb found?). I did not test 1.2.

> There is a FAQ entry also about this: http://www.open-mpi.org/faq/?category=openfabrics#ofa-port-wireup

That's what it *should* be doing, but I wonder if that's what it *actually* is doing.

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] Problem with openib on demand connection bring up.
Jeff Squyres wrote:
> I was just bitten yesterday by a problem that I've known about for a while but had never gotten around to looking into (I could have sworn that there was an open trac ticket on this, but I can't find one anywhere).
>
> I have 2 hosts: one with 3 active ports and one with 2 active ports. If I run an MPI job between them, the openib BTL wireup goes badly and it aborts. So a heterogeneous number of ports is not currently handled properly in the code.
>
> I don't know if Gleb's patch addresses this situation or not; I'll look at his patch this afternoon.

There is a ticket (closed) here: https://svn.open-mpi.org/trac/ompi/ticket/548
It was fixed by Galen for 1.2.

There is also a FAQ entry about this: http://www.open-mpi.org/faq/?category=openfabrics#ofa-port-wireup
Re: [OMPI devel] Problem with openib on demand connection bring up.
I wonder if this is bringing up the point that there are several of us working in the openib code base -- I wonder if it would be worthwhile to have a [short] teleconference to discuss what we're all doing in openib, where we're doing it (trunk, branch, whatever), when we expect to have it done, what version we need it in, etc. Just a coordination kind of teleconference. If people think this is a good idea, I can set up the call.

For example, don't forget that Nysal and I have the openib BTL port-selection stuff off in /tmp/jnysal-openib-wireup (the btl_openib_if_[in|ex]clude MCA params). Per my prior e-mail, if no one objects, I will be bringing that stuff in to the trunk tomorrow evening (I'm pretty sure it won't conflict with what Galen is doing; Galen and I discussed it on the phone this morning).

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] Problem with openib on demand connection bring up.
On Jun 13, 2007, at 12:08 PM, Gleb Natapov wrote:
> I am not committing this yet. I want people to review my logic and the patch. If the change is OK with everyone who cares, then I want this change to go into the 1.2 branch.
>
> I don't care how this change gets to the trunk. I can use a patched version for a while. If your branch is in a working state right now, I can merge this change into it tomorrow.

I was just bitten yesterday by a problem that I've known about for a while but had never gotten around to looking into (I could have sworn that there was an open trac ticket on this, but I can't find one anywhere).

I have 2 hosts: one with 3 active ports and one with 2 active ports. If I run an MPI job between them, the openib BTL wireup goes badly and it aborts. So a heterogeneous number of ports is not currently handled properly in the code.

I don't know if Gleb's patch addresses this situation or not; I'll look at his patch this afternoon.

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] Problem with openib on demand connection bring up.
On Wed, Jun 13, 2007 at 09:38:21AM -0600, Galen Shipman wrote:
> Hi Gleb,
>
> As we have discussed before, I am working on adding support for multiple QPs with either per-peer resources or shared resources. As a result, I am trying to clean up a lot of the OpenIB code. It has grown organically over the years and needs some attention. Perhaps we can coordinate on commits, or even work from the same temp branch, to do an overall cleanup as well as address the issue you describe in this email.
>
> I bring this up because this commit will conflict quite a bit with what I am working on. I can always merge it by hand, but it may make sense for us to get this all done in one area and then bring it all over?

I am not committing this yet. I want people to review my logic and the patch. If the change is OK with everyone who cares, then I want this change to go into the 1.2 branch.

I don't care how this change gets to the trunk. I can use a patched version for a while. If your branch is in a working state right now, I can merge this change into it tomorrow.

--
Gleb.
Re: [OMPI devel] Problem with openib on demand connection bring up.
On Jun 13, 2007, at 9:49 AM, Torsten Hoefler wrote:
> Hi Galen, Gleb,
>
> there is also something weird going on if I call the basic alltoall during the module_init() of a collective module (I need to wire up my own QPs in my coll component). It takes 7 seconds for 4 nodes and more than 30 minutes for 120 nodes. It seems to be an OpenIB wireup issue because if I start with -mca btl tcp,self this goes as fast as expected (<2 seconds). Will this issue be fixed with your patch?

No, this is a separate issue. Try:

  -mca mpi_preconnect_oob 1

then try:

  -mca mpi_preconnect_all 1

and let us know what the times are.

thx,
galen
Re: [OMPI devel] Problem with openib on demand connection bring up.
Hi Galen, Gleb,

there is also something weird going on if I call the basic alltoall during the module_init() of a collective module (I need to wire up my own QPs in my coll component). It takes 7 seconds for 4 nodes and more than 30 minutes for 120 nodes. It seems to be an OpenIB wireup issue because if I start with -mca btl tcp,self this goes as fast as expected (<2 seconds). Will this issue be fixed with your patch?

Thanks,
Torsten

--
bash$ :(){ :|:&};: - http://www.unixer.de/ -
Indiana University    | http://www.indiana.edu
Open Systems Lab      | http://osl.iu.edu/
150 S. Woodlawn Ave.  | Bloomington, IN, 474045-7104 | USA
Lindley Hall Room 135 | +01 (812) 855-3608
Re: [OMPI devel] Problem with openib on demand connection bring up.
Hi Gleb,

As we have discussed before, I am working on adding support for multiple QPs with either per-peer resources or shared resources. As a result, I am trying to clean up a lot of the OpenIB code. It has grown organically over the years and needs some attention. Perhaps we can coordinate on commits, or even work from the same temp branch, to do an overall cleanup as well as address the issue you describe in this email.

I bring this up because this commit will conflict quite a bit with what I am working on. I can always merge it by hand, but it may make sense for us to get this all done in one area and then bring it all over?

Thanks,

Galen

On Jun 13, 2007, at 7:27 AM, Gleb Natapov wrote:
> Hello everyone,
>
> I encountered a problem with the openib on-demand connection code. Basically, it works only by pure luck if you have more than one endpoint for the same proc, and it sometimes breaks in mysterious ways.
>
> The algorithm works like this: A wants to connect to B, so it creates a QP and sends it to B. B receives the QP from A, looks for an endpoint that is not yet associated with a remote endpoint, creates a QP for it, and sends the info back. Now A receives the QP and goes through the same logic as B, i.e. it looks for an endpoint that is not yet connected, BUT there is no guarantee that it will find the endpoint that initiated the connection in the first place! And if it finds another one, it will create a QP for it and send it back to B, and so on and so forth. In the end I sometimes get a peculiar mesh of connections where no QP has a connection back to it from the peer process.
>
> To overcome this problem, B needs to send back some info that will allow A to determine the endpoint that initiated the connection request. The lid:qp pair will allow for this. But even then the problem will remain if two procs initiate a connection at the same time. To deal with simultaneous connections, an asymmetric protocol has to be used: one peer becomes the master, the other the slave. The slave always initiates a connection to the master. The master chooses a local endpoint to satisfy the incoming request and sends the info back to the slave. If the master wants to initiate a connection, it sends a message to the slave, and the slave initiates a connection back to the master.
>
> The included patch implements the algorithm described above and works for all scenarios in which the current code fails to create a connection.
>
> --
> Gleb.

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
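The lid:qp echo that Gleb describes can be illustrated with a small sketch. This is not the patch itself: the structures and function names (`endpoint_t`, `conn_reply_t`, `find_initiator`, `handle_reply`) are hypothetical and much simpler than the real openib BTL data structures. The one point it shows is that A matches the reply against the lid:qp pair it originally sent, instead of picking any endpoint that happens to be unconnected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified endpoint record (not the real Open MPI struct). */
typedef struct {
    uint16_t local_lid;   /* LID of the local port */
    uint32_t local_qpn;   /* QP number created for this endpoint, 0 if none */
    uint16_t remote_lid;  /* filled in once a reply has been matched */
    uint32_t remote_qpn;
    int connecting;       /* 1 while a request from this endpoint is in flight */
    int connected;        /* 1 once the peer's QP info has been recorded */
} endpoint_t;

/* Hypothetical reply: the responder's new QP plus an echo of the
 * lid:qp pair that was carried in the request. */
typedef struct {
    uint16_t src_lid;
    uint32_t src_qpn;
    uint16_t echo_lid;
    uint32_t echo_qpn;
} conn_reply_t;

/* Find the endpoint that sent the request by comparing the echoed pair,
 * rather than taking the first endpoint that is not yet connected. */
static endpoint_t *find_initiator(endpoint_t *eps, size_t n,
                                  const conn_reply_t *r)
{
    assert(eps != NULL && r != NULL);
    for (size_t i = 0; i < n; i++) {
        if (eps[i].connecting &&
            eps[i].local_lid == r->echo_lid &&
            eps[i].local_qpn == r->echo_qpn) {
            return &eps[i];
        }
    }
    return NULL;
}

/* Record the peer's QP on the matching endpoint.
 * Returns the endpoint index, or -1 if no endpoint sent that request. */
int handle_reply(endpoint_t *eps, size_t n, const conn_reply_t *r)
{
    endpoint_t *ep = find_initiator(eps, n, r);
    if (ep == NULL) {
        return -1;
    }
    ep->remote_lid = r->src_lid;
    ep->remote_qpn = r->src_qpn;
    ep->connecting = 0;
    ep->connected = 1;
    return (int)(ep - eps);
}
```

For example, if A has an idle endpoint at index 0 and an endpoint at index 1 that sent a request with lid 2 and QP 0x51, a reply that echoes 2:0x51 is attached to index 1 even though index 0 comes first in the list. A reply whose echo matches no pending request is rejected instead of triggering a new QP.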