George --

All 4 tests fail for me -- can you have a look?

-----
[6:50] savbu-usnic-a:~/s/o/dist_graph ❯❯❯ mpirun --mca btl tcp,sm,self --host mpi001,mpi002,mpi003,mpi004 -np 5 --bynode distgraph_test_1
[mpi002:5304] *** An error occurred in MPI_Dist_graph_create
[mpi002:5304] *** reported by process [46910457249793,46909632806913]
[mpi002:5304] *** on communicator MPI_COMM_WORLD
[mpi002:5304] *** MPI_ERR_OTHER: known error not in list
[mpi002:5304] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[mpi002:5304] ***    and potentially your MPI job)
[savbu-usnic-a:24610] 4 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[savbu-usnic-a:24610] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[6:50] savbu-usnic-a:~/s/o/dist_graph ❯❯❯ mpirun --mca btl tcp,sm,self --host mpi001,mpi002,mpi003,mpi004 -np 5 --bynode distgraph_test_2
[mpi002:5316] *** An error occurred in MPI_Dist_graph_create_adjacent
[mpi002:5316] *** reported by process [46910457053185,46909632806913]
[mpi002:5316] *** on communicator MPI_COMM_WORLD
[mpi002:5316] *** MPI_ERR_OTHER: known error not in list
[mpi002:5316] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[mpi002:5316] ***    and potentially your MPI job)
[savbu-usnic-a:24615] 4 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[savbu-usnic-a:24615] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[6:51] savbu-usnic-a:~/s/o/dist_graph ❯❯❯ mpirun --mca btl tcp,sm,self --host mpi001,mpi002,mpi003,mpi004 -np 5 --bynode distgraph_test_3
[mpi001:5338] *** An error occurred in MPI_Dist_graph_create_adjacent
[mpi001:5338] *** reported by process [46910469242881,46909632806916]
[mpi001:5338] *** on communicator MPI_COMM_WORLD
[mpi001:5338] *** MPI_ERR_OTHER: known error not in list
[mpi001:5338] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[mpi001:5338] ***    and potentially your MPI job)
[savbu-usnic-a:24797] 4 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[savbu-usnic-a:24797] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[6:51] savbu-usnic-a:~/s/o/dist_graph ❯❯❯ mpirun --mca btl tcp,sm,self --host mpi001,mpi002,mpi003,mpi004 -np 5 --bynode distgraph_test_4
[mpi001:5351] *** An error occurred in MPI_Dist_graph_create
[mpi001:5351] *** reported by process [46910442110977,46909632806912]
[mpi001:5351] *** on communicator MPI_COMM_WORLD
[mpi001:5351] *** MPI_ERR_OTHER: known error not in list
[mpi001:5351] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[mpi001:5351] ***    and potentially your MPI job)
[savbu-usnic-a:24891] 4 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[savbu-usnic-a:24891] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[6:51] savbu-usnic-a:~/s/o/dist_graph ❯❯❯ 
-----



On Jul 1, 2013, at 8:41 AM, George Bosilca <bosi...@icl.utk.edu> wrote:

> The patch has been pushed into the trunk in r28687.
> 
>  George.
> 
> 
> On Jul 1, 2013, at 13:55 , George Bosilca <bosi...@icl.utk.edu> wrote:
> 
>> Guys,
>> 
>> Thanks for the patch and for the tests. All these changes/cleanups are 
>> correct; I have incorporated them all in the patch. Please find below the 
>> new patch.
>> 
>> As the deadline for the RFC is today, I'll move forward and push the changes 
>> into the trunk, and if there are still issues we can work them out directly 
>> in the trunk.
>> 
>> Thanks,
>> George.
>> 
>> PS: I will push your tests into our test base as well.
>> 
>> 
>> On Jul 1, 2013, at 06:39 , "Kawashima, Takahiro" 
>> <t-kawash...@jp.fujitsu.com> wrote:
>> 
>>> George,
>>> 
>>> My colleague was working on your ompi-topo bitbucket repository
>>> but did not finish. However, he found bugs in the patch attached
>>> to your previous mail and created a patch to fix them. See the attached
>>> patch, which is a patch against Open MPI trunk + your patch.
>>> 
>>> His test programs are also attached. test_1 and test_2 can run
>>> with nprocs=5, and test_3 and test_4 can run with nprocs>=3.
>>> 
>>> Though I'm not sure about the contents of the patch and the test
>>> programs, I can ask him if you have any questions.
>>> 
>>> Regards,
>>> Takahiro Kawashima,
>>> MPI development team,
>>> Fujitsu
>>> 
>>>> WHAT:    Support for MPI 2.2 dist_graph
>>>> 
>>>> WHY:     To become [almost entirely] MPI 2.2 compliant
>>>> 
>>>> WHEN:    Monday July 1st
>>>> 
>>>> As discussed during the last phone call, a missing piece of functionality 
>>>> of the MPI 2.2 standard (the distributed graph topology) is ready for 
>>>> prime time. The attached patch provides a minimal version (no components 
>>>> supporting reordering) that will complete the topology support in Open MPI.
>>>> 
>>>> It is a fairly major change compared with what we had before, and it 
>>>> reshapes the way we deal with topologies completely. Where our topologies 
>>>> were mainly storage components (for example, they were not capable of 
>>>> creating the new communicator), the new version is built around a 
>>>> [possibly] common representation (in mca/topo/topo.h), but the functions 
>>>> to attach and retrieve the topological information are specific to each 
>>>> component. As a result, the ompi_create_cart and ompi_create_graph 
>>>> functions become useless and have been removed.
>>>> 
>>>> In addition to adding the internal infrastructure to manage the topology 
>>>> information, it updates the MPI interface and the debugger support, and 
>>>> provides all Fortran interfaces. From a correctness point of view, it 
>>>> passes all the tests we have in ompi-tests for the cart and graph 
>>>> topologies, and some tests/applications for the dist_graph interface.
>>>> 
>>>> I don't think there is a need for a long wait on this one, so I would like 
>>>> to propose a short deadline: a week from now, on Monday, July 1st. A patch 
>>>> based on Open MPI trunk r28670 is attached below.
>>> <dist-graph-fix.patch><dist-graph-test.tar.gz>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

