loop_spawn

Eugene Loh Mon, 15 Aug 2011 11:47:07 -0400

This is a question about ompi-tests/ibm/dynamic. Some of these tests (spawn, spawn_multiple, loop_spawn/child, and no-disconnect) exercise MPI_Comm_spawn* functionality. Specifically, they spawn additional processes (beyond the initial mpirun launch) and therefore exert a different load on a test system than one might naively expect from the "mpirun -np <np>" command line.

One approach to testing is to have the test harness know characteristics about individual tests like this. E.g., if I have only 8 processors and I don't want to oversubscribe, have the test harness know that particular tests should be launched with fewer processes. On the other hand, building such generality into a test harness when changes would have to be so pervasive (subjective assessment) and so few tests require it may not make that much sense.

Another approach would be to manage oversubscription in the tests themselves. E.g., for spawn.c, instead of spawning np new processes, do the following:


- idle np/2 of the processes
- have the remaining np/2 processes spawn np/2 new ones
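
The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual spawn.c: the helper names (active_count, spawn_count, is_active) and the USE_MPI guard are made up for this sketch, and re-spawning argv[0] is an assumption about how the test would be launched. Note also that ranks waiting in MPI_Barrier may still poll and consume CPU, so "idle" here is not guaranteed to free a processor.

```c
/* Hypothetical helpers (not from spawn.c): decide how many of the np
 * launched processes stay active and how many new ones they spawn. */

/* Half of the launched processes stay active, but never fewer than one. */
int active_count(int np) { return np > 1 ? np / 2 : 1; }

/* The active group spawns as many processes as it has members, so
 * active + spawned <= np whenever np >= 2 (np == 1 still oversubscribes). */
int spawn_count(int np) { return active_count(np); }

/* Ranks [0, active_count) continue the test; the rest sit idle. */
int is_active(int rank, int np) { return rank < active_count(np); }

#ifdef USE_MPI /* build with an MPI compiler wrapper and -DUSE_MPI */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, np;
    MPI_Comm parent, active, children;

    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);

    if (parent != MPI_COMM_NULL) {
        /* Spawned child: detach from the parents and exit. */
        MPI_Comm_disconnect(&parent);
        MPI_Finalize();
        return 0;
    }

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    /* Idle ranks pass MPI_UNDEFINED and receive MPI_COMM_NULL. */
    MPI_Comm_split(MPI_COMM_WORLD,
                   is_active(rank, np) ? 0 : MPI_UNDEFINED, rank, &active);

    if (active != MPI_COMM_NULL) {
        /* Only the active half takes part in the collective spawn. */
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, spawn_count(np),
                       MPI_INFO_NULL, 0, active, &children,
                       MPI_ERRCODES_IGNORE);
        MPI_Comm_disconnect(&children);
        MPI_Comm_free(&active);
    }

    /* Idle ranks wait here until the active ranks are done. */
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}
#endif
```

With np = 8, ranks 0-3 would spawn 4 children while ranks 4-7 wait, so at most 8 processes do real work at once.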

(Okay, so that leaves open the possibility that the newly spawned processes might not appear on the same nodes where idled processes have "made room" for them. Each solution seems loaded with shortcomings.)

Anyhow, I was interested in some feedback on this topic. A very small number (1-4) of spawning tests are causing us lots of problems (undue complexity in the test harness as well as a bunch of our time for reasons I find difficult to explain succinctly). We're inclined to modify the tests so that they're a little more social. E.g., make decisions about how many of the launched processes should "really" be used, idling some fraction of the processes, and continuing the test only with the remaining fraction.


Comments?

[OMPI devel] ibm/dynamic/loop_spawn
