I don't know any implementation details, but is turning a 16-bit counter into a 32-bit counter really so much harder than this fancy (overengineered? ;-) ) table construction? The way I see it, this table might become a real mess if multiple MPI_Comm_spawn calls are issued simultaneously in different communicators. (Would that be legal MPI?)

Anyway, just my $0.01 (we don't get so many dollars for our euros anymore...)

-Andreas

On 17:02 Mon 27 Oct, Jeff Squyres wrote:
> How about a variation on that idea: keep a global bitmap or some other kind
> of "this ID is in use" table. Hence, if the launch counter rolls over, you
> can simply check the table to find a free value. That way, you can be sure
> to never re-use a value that is still being used.
>
> So we'd have 16 bits to express this counter, but we could introduce a
> limit of how many concurrent spawns we support. Hence, the IDs can be
> large, but we only allow having N distinct values at any one time (quite
> similar to PIDs and an OS process table). We can specify the value of N
> via configure, an MCA parameter, ...whatever. If the MPI job tries to have
> more than N concurrent spawned jobs, it's an error. But for a job that
> continuously spawns jobs that each die off in short finite time, it'll be
> no problem. The counter will likely cycle around, but won't run into any
> problems as long as there are <N total spawns still running.
>
> <waving hands a bit> There's probably some off-by-one errors in the above
> paragraph, but you get the idea. :-)
>
>
> On Oct 22, 2008, at 2:59 PM, Ralph Castain wrote:
>
>> I can't swear to this because I haven't fully grokked it yet, but I
>> believe the answer is:
>>
>> 1. if child jobs have completed, it won't hurt. I think the various
>> subsystem cleanup their bookkeeping when a job completes, so we could
>> possibly reuse the number. Might be some race conditions we would have
>> to resolve.
>>
>> 2. if child jobs haven't completed (which is the situation this
>> particular user was attempting), then we would have a problem with
>> jobid confusion. Once we get the procs launched, though, I'm not sure
>> how much of a problem there is - would have to investigate. Could
>> cause some bookkeeping problems for job completion.
>>
>> Interesting possibility, though...consider it another option for now.
>>
>>
>>
>> On Oct 22, 2008, at 12:53 PM, George Bosilca wrote:
>>
>> > What's happened if we roll around with the counter ?
>> >
>> > george.
>> >
>> > On Oct 22, 2008, at 2:49 PM, Ralph Castain wrote:
>> >
>> >> There recently was activity on the mailing lists where someone was
>> >> attempting to call comm_spawn 100,000 times. Setting aside the
>> >> threading issues that were the focus of that exchange, the fact is
>> >> that OMPI currently cannot handle that many comm_spawns.
>> >>
>> >> The ORTE jobid is composed of two elements:
>> >>
>> >> 1. the top 16-bits is an "identifier" for that mpirun
>> >>
>> >> 2. the lower 16-bits is a running counter identifying the specific
>> >> job/launch for those procs.
>> >>
>> >> Thus, we are limited to 64k comm_spawns.
>> >>
>> >> Expanding this would require either revamping the entire way we
>> >> handle jobs (e.g., removing the mpirun identifier - major effort),
>> >> or expanding the orte_jobid_t from its current 32-bits to 64-bits.
>> >>
>> >> Is this a problem we want to address?
>> >> Ralph
>> >>
>> >> _______________________________________________
>> >> devel mailing list
>> >> de...@open-mpi.org
>> >> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Jeff Squyres
> Cisco Systems

--
============================================
Andreas Schäfer
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
0049/3641-9-46376
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net
============================================

(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your
signature to help him gain world domination!
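
For readers trying to picture the "this ID is in use" table from the quoted proposal, here is a rough sketch in C. It assumes only what the thread states: a 32-bit jobid whose top 16 bits identify the mpirun and whose lower 16 bits are the launch counter, plus a cap N on concurrent spawns. Every name in it (jobid_table_t, jobid_pack, jobid_acquire, jobid_release) is hypothetical and not part of ORTE, and it ignores the locking that simultaneous spawns would presumably need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; this is not the ORTE API. */
#define LOCAL_ID_BITS  16u
#define LOCAL_ID_COUNT (1u << LOCAL_ID_BITS)  /* 65536 possible counter values */

typedef struct {
    uint8_t  in_use[LOCAL_ID_COUNT / 8]; /* one bit per 16-bit local job ID */
    uint32_t next;        /* launch counter: where the next search starts */
    uint32_t active;      /* number of IDs currently handed out */
    uint32_t max_active;  /* N: limit on concurrently live spawns */
} jobid_table_t;

static void jobid_table_init(jobid_table_t *t, uint32_t max_active)
{
    memset(t, 0, sizeof(*t));
    t->max_active = max_active < LOCAL_ID_COUNT ? max_active : LOCAL_ID_COUNT;
}

/* Combine the 16-bit mpirun identifier with the 16-bit launch counter. */
static uint32_t jobid_pack(uint16_t mpirun_id, uint16_t local_id)
{
    return ((uint32_t)mpirun_id << LOCAL_ID_BITS) | local_id;
}

/* Store a free local ID in *out and return true, or return false
 * when N IDs are already live. The search wraps around at 65536. */
static bool jobid_acquire(jobid_table_t *t, uint16_t *out)
{
    if (t->active >= t->max_active) {
        return false;
    }
    for (uint32_t i = 0; i < LOCAL_ID_COUNT; i++) {
        uint32_t id = (t->next + i) % LOCAL_ID_COUNT;
        uint8_t mask = (uint8_t)(1u << (id % 8));
        if (!(t->in_use[id / 8] & mask)) {
            t->in_use[id / 8] |= mask;
            t->next = (id + 1) % LOCAL_ID_COUNT;
            t->active++;
            *out = (uint16_t)id;
            return true;
        }
    }
    return false; /* not reached while active < LOCAL_ID_COUNT */
}

/* Mark a local ID as free again once its job has completed. */
static void jobid_release(jobid_table_t *t, uint16_t id)
{
    uint8_t mask = (uint8_t)(1u << (id % 8));
    assert(t->in_use[id / 8] & mask); /* releasing an unused ID is a bug */
    t->in_use[id / 8] &= (uint8_t)~mask;
    t->active--;
}
```

The table costs 8 KB per mpirun in this form. Simply widening the counter would avoid the table altogether, at the price of changing the jobid type everywhere it is used.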