What happens if the counter rolls over?

  george.

On Oct 22, 2008, at 2:49 PM, Ralph Castain wrote:

There recently was activity on the mailing lists where someone was attempting to call comm_spawn 100,000 times. Setting aside the threading issues that were the focus of that exchange, the fact is that OMPI currently cannot handle that many comm_spawns.

The ORTE jobid is composed of two elements:

1. the top 16 bits are an "identifier" for that mpirun

2. the lower 16 bits are a running counter identifying the specific job/launch for those procs.

Thus, we are limited to 64k comm_spawns.
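
To make the arithmetic concrete, here is a minimal sketch of the layout described above. The names (make_jobid, split_jobid, next_jobid) are hypothetical and are not ORTE's actual macros or functions. The wrap-around in next_jobid is an assumption made only to illustrate what a 16-bit counter can represent, not a statement of what ORTE does when the counter is exhausted.

```python
# Illustrative model of a 32-bit jobid split into two 16-bit fields.
# All names here are hypothetical; they are not taken from the ORTE source.

FIELD_BITS = 16
FIELD_MASK = (1 << FIELD_BITS) - 1  # 0xFFFF, i.e. 65536 distinct values


def make_jobid(mpirun_id, local_job):
    """Pack the mpirun identifier into the upper 16 bits and the
    running job counter into the lower 16 bits."""
    return ((mpirun_id & FIELD_MASK) << FIELD_BITS) | (local_job & FIELD_MASK)


def split_jobid(jobid):
    """Return (mpirun_id, local_job) from a packed 32-bit jobid."""
    return (jobid >> FIELD_BITS) & FIELD_MASK, jobid & FIELD_MASK


def next_jobid(jobid):
    """Advance the job counter by one.

    ASSUMPTION: the counter is simply masked to 16 bits, so it wraps
    to 0 after 0xFFFF and would collide with an earlier job.
    """
    mpirun_id, local_job = split_jobid(jobid)
    return make_jobid(mpirun_id, local_job + 1)
```

Under these assumptions, the counter field holds 65536 distinct values per mpirun, which is where the 64k limit comes from; stepping past 0xFFFF in this model lands back on counter value 0 for the same mpirun identifier.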

Expanding this would require either revamping the entire way we handle jobs (e.g., removing the mpirun identifier, which would be a major effort), or expanding the orte_jobid_t from its current 32 bits to 64 bits.

Is this a problem we want to address?
Ralph

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
