How about a variation on that idea: keep a global bitmap or some other
kind of "this ID is in use" table. Hence, if the launch counter rolls
over, you can simply check the table to find a free value. That way,
you can be sure to never re-use a value that is still being used.
So we'd still have 16 bits to express this counter, but we could introduce a
limit on how many concurrent spawns we support. Hence, the IDs can be
large, but we only allow N distinct values to be in use at any one time
(quite similar to PIDs and an OS process table). We can specify the
value of N via configure, an MCA parameter, ...whatever. If the MPI
job tries to have more than N concurrent spawned jobs, it's an error.
But for a job that continuously spawns jobs that each die off in a short,
finite time, it'll be no problem. The counter will likely cycle
around, but won't run into any problems as long as there are fewer than
N spawns still running.
<waving hands a bit> There are probably some off-by-one errors in the
above paragraph, but you get the idea. :-)
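To make that a bit more concrete, here is a rough Python sketch of the "in use" table. It is not ORTE code: the names (JobCounterTable, max_concurrent, make_jobid) are made up for illustration, and it assumes every 16-bit counter value is usable, which may not hold in the real code. It only shows the idea of skipping values that are still live after the counter wraps, with a cap of N concurrent spawns.

```python
COUNTER_BITS = 16
COUNTER_SPACE = 1 << COUNTER_BITS   # 65,536 possible counter values
COUNTER_MASK = COUNTER_SPACE - 1


def make_jobid(mpirun_id, counter):
    """Pack a 32-bit jobid: top 16 bits mpirun identifier, low 16 bits counter."""
    return ((mpirun_id & COUNTER_MASK) << COUNTER_BITS) | (counter & COUNTER_MASK)


class JobCounterTable:
    """Hypothetical allocator for the low 16 bits of a jobid."""

    def __init__(self, max_concurrent):
        if not 0 < max_concurrent <= COUNTER_SPACE:
            raise ValueError("max_concurrent must be in 1..65536")
        self.max_concurrent = max_concurrent     # the "N" from the text
        self.in_use = bytearray(COUNTER_SPACE)   # one flag per counter value
        self.active = 0                          # how many values are live
        self.next_value = 0                      # the running launch counter

    def allocate(self):
        """Return a counter value that no live job is using."""
        if self.active >= self.max_concurrent:
            raise RuntimeError("too many concurrent spawned jobs")
        # active < N <= 65536 here, so a free value exists and the loop ends.
        while self.in_use[self.next_value]:
            self.next_value = (self.next_value + 1) % COUNTER_SPACE
        value = self.next_value
        self.in_use[value] = 1
        self.active += 1
        self.next_value = (value + 1) % COUNTER_SPACE
        return value

    def release(self, value):
        """Mark a value as free again once its job has completed."""
        if not self.in_use[value]:
            raise KeyError("counter value %d is not in use" % value)
        self.in_use[value] = 0
        self.active -= 1
```

With this, a parent that keeps one long-lived child around and spawns 100,000 short-lived ones would see the counter wrap, but the long-lived child's value is never handed out again until it is released. The race conditions around job completion are not addressed here at all.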
On Oct 22, 2008, at 2:59 PM, Ralph Castain wrote:
I can't swear to this because I haven't fully grokked it yet, but I
believe the answer is:
1. if child jobs have completed, it won't hurt. I think the various
subsystems clean up their bookkeeping when a job completes, so we could
possibly reuse the number. There might be some race conditions we would
have to resolve.
2. if child jobs haven't completed (which is the situation this
particular user was attempting), then we would have a problem with
jobid confusion. Once we get the procs launched, though, I'm not sure
how much of a problem there is - would have to investigate. Could
cause some bookkeeping problems for job completion.
Interesting possibility, though...consider it another option for now.
On Oct 22, 2008, at 12:53 PM, George Bosilca wrote:
> What happens if the counter rolls over?
>
> george.
>
> On Oct 22, 2008, at 2:49 PM, Ralph Castain wrote:
>
>> There recently was activity on the mailing lists where someone was
>> attempting to call comm_spawn 100,000 times. Setting aside the
>> threading issues that were the focus of that exchange, the fact is
>> that OMPI currently cannot handle that many comm_spawns.
>>
>> The ORTE jobid is composed of two elements:
>>
>> 1. the top 16 bits are an "identifier" for that mpirun
>>
>> 2. the lower 16 bits are a running counter identifying the specific
>> job/launch for those procs.
>>
>> Thus, we are limited to 64k comm_spawns.
>>
>> Expanding this would require either revamping the entire way we
>> handle jobs (e.g., removing the mpirun identifier - major effort),
>> or expanding the orte_jobid_t from its current 32 bits to 64 bits.
>>
>> Is this a problem we want to address?
>> Ralph
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
--
Jeff Squyres
Cisco Systems