Ralph,

Sorry for the late reply -- I was away on vacation.
Regarding your earlier question about how many processes were involved
when the memory was entirely allocated: it was only two, a sender and a
receiver. I'm still trying to pinpoint what is different between the
standalone case and the "integrated" case. I will try to find out what
part of the code is allocating memory in a loop.

On Tue, Jul 20, 2010 at 12:51 AM, Ralph Castain <r...@open-mpi.org> wrote:
> Well, I finally managed to make this work without the required
> ompi-server rendezvous point. The fix is only in the devel trunk right
> now - I'll have to ask the release managers for 1.5 and 1.4 if they
> want it ported to those series.

Great -- I'll give it a try.

> On the notion of integrating OMPI to your launch environment: remember
> that we don't necessarily require that you use mpiexec for that
> purpose. If your launch environment provides just a little info in the
> environment of the launched procs, we can usually devise a method that
> allows the procs to perform an MPI_Init as a single job without all
> this work you are doing.

I'm working on creating operators using MPI for the IBM product
"InfoSphere Streams". It has its own launching mechanism to start the
processes. However, I can pass some information to the processes that
belong to the same job (a Streams job, which should map neatly to an
MPI job).

> Only difference is that your procs will all block in MPI_Init until
> they -all- have executed that function. If that isn't a problem, this
> would be a much more scalable and reliable method than doing it thru
> massive calls to MPI_Comm_connect.

In the general case that would be a problem, but for my prototype it is
acceptable. In general, each process is composed of operators; some may
be MPI-related and some may not. But in my case, I know ahead of time
which processes will be part of the MPI job, so I can easily deal with
the fact that they would block on MPI_Init (actually MPI_Init_thread,
since it's using a lot of threads).
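To illustrate, here is roughly what the MPI part of each such process
would do at startup (a minimal sketch, not the actual operator code --
the Streams runtime plumbing is left out):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int provided;

    /* In the single-job launch Ralph describes, this call blocks
       until every process of the job has entered MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "insufficient thread support: %d\n", provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* ... operator work: connects, sends, receives from many threads ... */

    MPI_Finalize();
    return 0;
}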
Is there documentation or an example showing what information I can
pass to the processes to enable that? Is it just environment variables?

Many thanks!
p.

> On Jul 18, 2010, at 4:09 PM, Philippe wrote:
>
>> Ralph,
>>
>> Thanks for investigating.
>>
>> I've applied the two patches you mentioned earlier and ran with the
>> ompi-server. Although I was able to run our standalone test, when I
>> integrated the changes into our code the processes entered a crazy
>> loop and allocated all the available memory when calling
>> MPI_Comm_connect. I was not able to identify why it works standalone
>> but not integrated with our code. If I find out why, I'll let you
>> know.
>>
>> Looking forward to your findings. We'll be happy to test any patches
>> if you have some!
>>
>> p.
>>
>> On Sat, Jul 17, 2010 at 9:47 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> Okay, I can reproduce this problem. Frankly, I don't think this
>>> ever worked with OMPI, and I'm not sure how the choice of BTL makes
>>> a difference.
>>>
>>> The program is crashing in the communicator definition, which
>>> involves a communication over our internal out-of-band messaging
>>> system. That system has zero connection to any BTL, so it should
>>> crash either way.
>>>
>>> Regardless, I will play with this a little as time allows. Thanks
>>> for the reproducer!
>>>
>>> On Jun 25, 2010, at 7:23 AM, Philippe wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm trying to run a test program which consists of a server
>>>> creating a port using MPI_Open_port and N clients using
>>>> MPI_Comm_connect to connect to the server.
>>>>
>>>> I'm able to do so with 1 server and 2 clients, but with 1 server +
>>>> 3 clients, I get the following error message:
>>>>
>>>> [node003:32274] [[37084,0],0]:route_callback tried routing message
>>>> from [[37084,1],0] to [[40912,1],0]:102, can't find route
>>>>
>>>> This only happens with the openib BTL. With the tcp BTL it works
>>>> perfectly fine (ofud also works, as a matter of fact...). This has
>>>> been tested on two completely different clusters, with identical
>>>> results. In either case, the IB fabric works normally.
>>>>
>>>> Any help would be greatly appreciated! Several people in my team
>>>> have looked at the problem. Google and the mailing list archive
>>>> did not provide any clue. I believe that from an MPI standpoint my
>>>> test program is valid (and it works with TCP, which makes me feel
>>>> better about the sequence of MPI calls).
>>>>
>>>> Regards,
>>>> Philippe.
>>>>
>>>>
>>>> Background:
>>>>
>>>> I intend to use Open MPI to transport data inside a much larger
>>>> application. Because of that, I cannot use mpiexec. Each process
>>>> is started by our own "job management" and uses a name server to
>>>> find out about the others. Once all the clients are connected, I
>>>> would like the server to do MPI_Recv to get the data from all the
>>>> clients. I don't care about the order or about which clients are
>>>> sending data, as long as I can receive it with one call. To do
>>>> that, the clients and the server go through a series of
>>>> Comm_accept/Comm_connect/Intercomm_merge calls so that at the end,
>>>> all the clients and the server are inside the same intracomm.
>>>>
>>>> Steps:
>>>>
>>>> I have a sample program that shows the issue. I tried to make it
>>>> as short as possible. It needs to be executed on a shared file
>>>> system like NFS because the server writes the port info to a file
>>>> that the clients will read. To reproduce the issue, the following
>>>> steps should be performed:
>>>>
>>>> 0. compile the test with "mpicc -o ben12 ben12.c"
>>>> 1. ssh to the machine that will be the server
>>>> 2. run ./ben12 3 1
>>>> 3. ssh to the machine that will be client #1
>>>> 4. run ./ben12 3 0
>>>> 5. repeat steps 3-4 for clients #2 and #3
>>>>
>>>> The server accepts the connection from client #1 and merges it
>>>> into a new intracomm. It then accepts the connection from client
>>>> #2 and merges it. When client #3 arrives, the server accepts the
>>>> connection, but that causes clients #1 and #2 to die with the
>>>> error above (see the complete trace in the tarball).
>>>>
>>>> The exact steps are:
>>>>
>>>> - server opens port
>>>> - server does accept
>>>> - client #1 does connect
>>>> - server and client #1 do merge
>>>> - server does accept
>>>> - client #2 does connect
>>>> - server, client #1 and client #2 do merge
>>>> - server does accept
>>>> - client #3 does connect
>>>> - server, client #1, client #2 and client #3 do merge
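>>>>
>>>> In code, that sequence boils down to something like this (a
>>>> simplified sketch in the spirit of ben12.c -- the NFS port-file
>>>> exchange and all error handling are omitted, and the names here
>>>> are illustrative):
>>>>
>>>> #include <mpi.h>
>>>>
>>>> #define NCLIENTS 3
>>>>
>>>> /* Illustrative sketch, not the actual ben12.c: one server and
>>>>    NCLIENTS clients end up in a single intracomm by repeated
>>>>    connect-or-accept followed by merge. */
>>>> static MPI_Comm join_all(int is_server, char *port)
>>>> {
>>>>     MPI_Comm intra, inter;
>>>>     int members;                   /* processes currently in 'intra' */
>>>>
>>>>     if (is_server) {
>>>>         intra = MPI_COMM_SELF;     /* the server starts alone */
>>>>         members = 1;
>>>>     } else {
>>>>         /* a new client connects alone and merges onto the "high"
>>>>            side, so the server keeps rank 0 in every intracomm */
>>>>         MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
>>>>         MPI_Intercomm_merge(inter, 1, &intra);
>>>>         MPI_Comm_disconnect(&inter);
>>>>         MPI_Comm_size(intra, &members);
>>>>     }
>>>>
>>>>     /* everyone already merged takes part in each later accept, so
>>>>        the group grows by one client per iteration */
>>>>     while (members < NCLIENTS + 1) {
>>>>         MPI_Comm merged;
>>>>         MPI_Comm_accept(port, MPI_INFO_NULL, 0, intra, &inter);
>>>>         MPI_Intercomm_merge(inter, 0, &merged);
>>>>         MPI_Comm_disconnect(&inter);
>>>>         if (intra != MPI_COMM_SELF)
>>>>             MPI_Comm_free(&intra);
>>>>         intra = merged;
>>>>         members++;
>>>>     }
>>>>     return intra;                  /* server + all clients */
>>>> }
>>>>
>>>> Once join_all() returns, the server can receive from whichever
>>>> client sends first with a single call, e.g. MPI_Recv(buf, count,
>>>> MPI_BYTE, MPI_ANY_SOURCE, MPI_ANY_TAG, intra, &status).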
>>>>
>>>> My InfiniBand network works normally with other test programs or
>>>> applications (MPI or others, like Verbs).
>>>>
>>>> Info about my setup:
>>>>
>>>> Open MPI version = 1.4.1 (I also tried 1.4.2, a nightly snapshot
>>>> of 1.4.3, and a nightly snapshot of 1.5 -- all show the same error)
>>>> config.log in the tarball
>>>> "ompi_info --all" in the tarball
>>>> OFED version = 1.3, installed from RHEL 5.3
>>>> Distro = Red Hat Enterprise Linux 5.3
>>>> Kernel = 2.6.18-128.4.1.el5 x86_64
>>>> subnet manager = built-in SM from the Cisco/Topspin switch
>>>> output of ibv_devinfo included in the tarball (there are no "bad" nodes)
>>>> "ulimit -l" says "unlimited"
>>>>
>>>> The tarball contains:
>>>>
>>>> - ben12.c: my test program showing the behavior
>>>> - config.log / config.out / make.out / make-install.out /
>>>>   ifconfig.txt / ibv-devinfo.txt / ompi_info.txt
>>>> - trace-tcp.txt: output of the server and each client when it
>>>>   works with TCP (I added "btl = tcp,self" in ~/.openmpi/mca-params.conf)
>>>> - trace-ib.txt: output of the server and each client when it fails
>>>>   with IB (I added "btl = openib,self" in ~/.openmpi/mca-params.conf)
>>>>
>>>> I hope I provided enough info for somebody to reproduce the problem...
>>>>
>>>> <ompi-output.tar.bz2>