As all the examples work perfectly in my version of the code, I was puzzled by Jeff's issue. It turns out it is a side effect of trying to push as few items as possible instead of just pushing everything into the trunk. I'll fix it in a few minutes; meanwhile, here are a few words about what the issue was.
One might have noticed that this framework came without any components. The reason is that all the components in development are still at the "paper in progress" stage, and thus have not been pushed into the trunk. However, the level of functionality required by the MPI 2.2 standard is provided by the functions in the base, so it will work reasonably well as is. It still needs a "module", though, otherwise the functions in the base don't have a placeholder to attach to. Thus it is crucial to have a decoy component, one that can provide the empty module onto which the base functions are copied. So the problem Jeff noticed was the lack of a basic component in the topo framework.

George.
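To make this concrete, here is roughly what such a decoy looks like. This is only a minimal sketch under simplified assumptions: the types and names below (topo_module_t, topo_component_t, basic_query) are illustrative stand-ins, not the actual Open MPI MCA symbols.

-----
/* A minimal "decoy" topo component (hypothetical names): its only job
 * is to hand back an empty module, so that the framework has a
 * placeholder onto which the base functions can be copied. */
#include <stdlib.h>

typedef struct topo_module {
    /* Left NULL on purpose: the base implementations are copied over
     * these placeholders once the module has been selected. */
    int (*graph_create)(void);
    int (*dist_graph_create)(void);
} topo_module_t;

typedef struct topo_component {
    const char *name;
    topo_module_t *(*query)(int *priority);
} topo_component_t;

static topo_module_t *basic_query(int *priority)
{
    topo_module_t *module = calloc(1, sizeof(*module));
    if (NULL == module) {
        return NULL;
    }
    *priority = 0;   /* lowest priority: selected only as a last resort */
    return module;   /* empty module: every function pointer is NULL */
}

topo_component_t topo_basic_component = { "basic", basic_query };
-----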
On Jul 1, 2013, at 15:51, "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:

> George --
> 
> All 4 tests fail for me -- can you have a look?
> 
> -----
> [6:50] savbu-usnic-a:~/s/o/dist_graph ❯❯❯ mpirun --mca btl tcp,sm,self --host mpi001,mpi002,mpi003,mpi004 -np 5 --bynode distgraph_test_1
> [mpi002:5304] *** An error occurred in MPI_Dist_graph_create
> [mpi002:5304] *** reported by process [46910457249793,46909632806913]
> [mpi002:5304] *** on communicator MPI_COMM_WORLD
> [mpi002:5304] *** MPI_ERR_OTHER: known error not in list
> [mpi002:5304] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [mpi002:5304] *** and potentially your MPI job)
> [savbu-usnic-a:24610] 4 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
> [savbu-usnic-a:24610] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [6:50] savbu-usnic-a:~/s/o/dist_graph ❯❯❯ mpirun --mca btl tcp,sm,self --host mpi001,mpi002,mpi003,mpi004 -np 5 --bynode distgraph_test_2
> [mpi002:5316] *** An error occurred in MPI_Dist_graph_create_adjacent
> [mpi002:5316] *** reported by process [46910457053185,46909632806913]
> [mpi002:5316] *** on communicator MPI_COMM_WORLD
> [mpi002:5316] *** MPI_ERR_OTHER: known error not in list
> [mpi002:5316] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [mpi002:5316] *** and potentially your MPI job)
> [savbu-usnic-a:24615] 4 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
> [savbu-usnic-a:24615] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [6:51] savbu-usnic-a:~/s/o/dist_graph ❯❯❯ mpirun --mca btl tcp,sm,self --host mpi001,mpi002,mpi003,mpi004 -np 5 --bynode distgraph_test_3
> [mpi001:5338] *** An error occurred in MPI_Dist_graph_create_adjacent
> [mpi001:5338] *** reported by process [46910469242881,46909632806916]
> [mpi001:5338] *** on communicator MPI_COMM_WORLD
> [mpi001:5338] *** MPI_ERR_OTHER: known error not in list
> [mpi001:5338] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [mpi001:5338] *** and potentially your MPI job)
> [savbu-usnic-a:24797] 4 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
> [savbu-usnic-a:24797] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [6:51] savbu-usnic-a:~/s/o/dist_graph ❯❯❯ mpirun --mca btl tcp,sm,self --host mpi001,mpi002,mpi003,mpi004 -np 5 --bynode distgraph_test_4
> [mpi001:5351] *** An error occurred in MPI_Dist_graph_create
> [mpi001:5351] *** reported by process [46910442110977,46909632806912]
> [mpi001:5351] *** on communicator MPI_COMM_WORLD
> [mpi001:5351] *** MPI_ERR_OTHER: known error not in list
> [mpi001:5351] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [mpi001:5351] *** and potentially your MPI job)
> [savbu-usnic-a:24891] 4 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
> [savbu-usnic-a:24891] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [6:51] savbu-usnic-a:~/s/o/dist_graph ❯❯❯
> -----
> 
> 
> On Jul 1, 2013, at 8:41 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
> 
>> The patch has been pushed into the trunk in r28687.
>> 
>> George.
>> 
>> 
>> On Jul 1, 2013, at 13:55, George Bosilca <bosi...@icl.utk.edu> wrote:
>> 
>>> Guys,
>>> 
>>> Thanks for the patch and for the tests. All these changes/cleanups are correct; I have incorporated them all into the patch. Please find the new patch below.
>>> 
>>> As the deadline for the RFC is today, I'll move forward and push the changes into the trunk; if there are still issues, we can work them out directly in the trunk.
>>> 
>>> Thanks,
>>> George.
>>> 
>>> PS: I will push your tests into our test base as well.
>>> 
>>> 
>>> On Jul 1, 2013, at 06:39, "Kawashima, Takahiro" <t-kawash...@jp.fujitsu.com> wrote:
>>> 
>>>> George,
>>>> 
>>>> My colleague was working on your ompi-topo bitbucket repository, but that work was not completed. However, he found bugs in the patch attached to your previous mail and created a patch fixing them. See the attached patch, which applies against Open MPI trunk + your patch.
>>>> 
>>>> His test programs are also attached. test_1 and test_2 can run with nprocs=5, and test_3 and test_4 can run with nprocs>=3.
>>>> 
>>>> Though I'm not sure about the contents of the patch and the test programs, I can ask him if you have any questions.
>>>> 
>>>> Regards,
>>>> Takahiro Kawashima,
>>>> MPI development team,
>>>> Fujitsu
>>>> 
>>>>> WHAT: Support for MPI 2.2 dist_graph
>>>>> 
>>>>> WHY: To become [almost entirely] MPI 2.2 compliant
>>>>> 
>>>>> WHEN: Monday July 1st
>>>>> 
>>>>> As discussed during the last phone call, a missing functionality of the MPI 2.2 standard (the distributed graph topology) is ready for prime time. The attached patch provides a minimal version (no components supporting reordering) that will complete the topology support in Open MPI.
>>>>> 
>>>>> It is somewhat of a major change compared with what we had before, and it completely reshapes the way we deal with topologies. Where our topologies were mainly storage components (they were not capable of creating the new communicator, for example), the new version is built around a [possibly] common representation (in mca/topo/topo.h), but the functions to attach and retrieve the topological information are specific to each component. As a result, the ompi_create_cart and ompi_create_graph functions became useless and have been removed.
>>>>> 
>>>>> In addition to adding the internal infrastructure to manage the topology information, the patch updates the MPI interface and the debugger support, and provides all the Fortran interfaces. From a correctness point of view, it passes all the tests we have in ompi-tests for the cart and graph topologies, as well as some tests/applications for the dist_graph interface.
>>>>> 
>>>>> I don't think there is a need for a long wait on this one, so I would like to propose a short deadline: a week from now, on Monday July 1st. A patch based on Open MPI trunk r28670 is attached below.
>>>> <dist-graph-fix.patch><dist-graph-test.tar.gz>
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
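For reference, here is a minimal example of the MPI 2.2 dist_graph interface that the failing tests above exercise. The directed-ring neighbor lists are made up for illustration; this is not what distgraph_test_1 through distgraph_test_4 actually do.

-----
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Comm ring;
    int rank, size, source, dest;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank declares one in-edge (left neighbor) and one out-edge
     * (right neighbor), forming a directed ring. */
    source = (rank + size - 1) % size;
    dest   = (rank + 1) % size;

    MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD,
                                   1, &source, MPI_UNWEIGHTED,
                                   1, &dest,   MPI_UNWEIGHTED,
                                   MPI_INFO_NULL, 0 /* no reorder */, &ring);

    printf("rank %d: source=%d dest=%d\n", rank, source, dest);

    MPI_Comm_free(&ring);
    MPI_Finalize();
    return 0;
}
-----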