On 23/05/2015 at 14:10, Anton Gladky wrote:
> Hi Thibaut,
>
> for testing MPI I add the following line into the script:
>
>     export OMPI_MCA_orte_rsh_agent=/bin/false
>
> Usually it works [1-3].
Dear Anton,

Thanks for the tip. Unfortunately it does not work (tested only on jessie and on gyoto): setting this variable does not change anything.

Gyoto is a bit peculiar compared to most MPI codes in that it uses MPI_Comm_spawn to spawn its workers instead of relying on mpirun to launch several identical processes. This scenario may hit issues that the more classical strategy does not.

Oddly, the shared-memory transport does not seem to work at all: with

    orterun --mca btl sm,self <job>

the code always crashes the machine. What does work on my box is:

    orterun --mca btl_tcp_if_include lo <job>

This never crashes the machine, but it does not work in a chroot (for lack of a loopback interface, I guess). I get this error message:

adt-run [19:01:17]: test python-gyoto-mpi: [-----------------------
[tantive-iv:26356] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ess_hnp_module.c at line 170
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can fail
during orte_init; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  orte_plm_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[tantive-iv:26356] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 128
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can fail
during orte_init; some of which are due to configuration or environment
problems.
This failure appears to be an internal failure; here's some additional
information (which may only be relevant to an Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[tantive-iv:26356] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file orterun.c at line 694
adt-run [19:01:18]: test python-gyoto-mpi: -----------------------]
adt-run [19:01:18]: test python-gyoto-mpi:  - - - - - - - - - - results - - - - - - - - - -
python-gyoto-mpi     FAIL non-zero exit status 243
thibaut@tantive-iv:~/git/gyoto$

Kind regards, Thibaut.

> [1] http://anonscm.debian.org/cgit/debian-science/packages/esys-particle.git/tree/debian/tests/build1#n7
> [2] https://anonscm.debian.org/cgit/debian-science/packages/liggghts.git/tree/debian/tests/heat
> [3] https://anonscm.debian.org/cgit/debian-science/packages/liggghts.git/tree/debian/tests/packing
>
> Best regards
>
> Anton
>
> 2015-05-23 10:41 GMT+02:00 Thibaut Paumard <[email protected]>:
>> Hi,
>>
>> I'm working on autopkgtest support in one of my packages, gyoto.
>>
>> The upcoming upstream release (a preview is available on our alioth
>> git repo) features MPI parallelisation, and I want to test this
>> feature.
>>
>> In my experience, running MPI code requires network access. Failing
>> that, openmpi hangs the machine. For instance, when debugging in the
>> subway, I have to connect to my cell phone over wifi, or else the
>> computer will freeze during the test suite!
>>
>> I'm wondering whether putting "Restrictions: isolation-container" in
>> the test stanza is sufficient to ensure openmpi will behave properly.
>>
>> Also, is there a way to test the code during build, since it is
>> forbidden to access the network at that time?
>>
>> Kind regards, Thibaut.
>>
>> --
>> To UNSUBSCRIBE, email to [email protected]
>> with a subject of "unsubscribe". Trouble?
Contact [email protected] >> Archive: https://lists.debian.org/[email protected] >> > > -- To UNSUBSCRIBE, email to [email protected] with a subject of "unsubscribe". Trouble? Contact [email protected] Archive: https://lists.debian.org/[email protected]


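P.P.S. Since the chroot failure above seems to come down to the missing loopback interface, here is a stdlib-only Python sketch for probing the test environment before attempting the TCP-over-lo workaround (the helper name is mine, not part of Gyoto or autopkgtest):

```python
import socket

def loopback_available():
    """Return True if a UDP socket can be bound to 127.0.0.1."""
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            # Binding to port 0 lets the kernel pick a free port;
            # this fails if the loopback interface is absent or down.
            s.bind(("127.0.0.1", 0))
        finally:
            s.close()
        return True
    except OSError:
        return False

if __name__ == "__main__":
    print("loopback available:", loopback_available())
```

One could run such a check at the top of the test script and skip (or fail early with a clear message on) the orterun invocation when it returns False.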