WHAT: Should we make the job size (i.e., initial number of procs) available in OPAL?
WHY: At least 2 BTLs are using this info (*more below)
WHERE: usnic and ugni
TIMEOUT: There have already been some inflammatory emails about this; let's discuss next Tuesday on the teleconf: Tue, 5 Aug 2014

MORE DETAIL:

This is an open question. We *have* the information at the time that the BTLs are initialized: do we allow that information to go down to OPAL?

Ralph added this info down in OPAL in r32355, but George reverted it in r32361.

Points for: YES, WE SHOULD
+++ 2 BTLs were using it (usnic, ugni)
+++ Other RTE job-related info is already in OPAL (num local ranks, local rank)

Points for: NO, WE SHOULD NOT
--- What exactly is this number (e.g., num currently-connected procs?), and when is it updated?
--- We need to precisely delineate what belongs in OPAL vs. above-OPAL

FWIW: here's how ompi_process_info.num_procs was used before the BTL move down to OPAL:

- usnic: for a minor latency optimization / sizing of a shared receive buffer queue length, and for the initial size of a peer lookup hash
- ugni: to determine the size of the per-peer buffers used for send/recv communication

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
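
To make the two uses listed under FWIW concrete, here is a rough C sketch of how a BTL might turn the job size into an initial hash size and a per-peer buffer size. This is not the actual usnic or ugni code: every name below (DEFAULT_PEER_TABLE_SIZE, next_pow2, peer_table_initial_size, per_peer_buffer_bytes), the 2x sizing factor, and the "0 means unknown" convention are assumptions made up for illustration.

```c
#include <stddef.h>

/* Hypothetical fallback for when the job size is unknown (passed as 0). */
#define DEFAULT_PEER_TABLE_SIZE 64

/* Round n up to the next power of two (returns 1 for n <= 1).
 * Sketch only: does not guard against overflow for huge n. */
static size_t next_pow2(size_t n)
{
    size_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/* Initial bucket count for a peer lookup hash: roughly twice the
 * job size, rounded up to a power of two (assumed policy). */
static size_t peer_table_initial_size(size_t num_procs)
{
    if (0 == num_procs) {
        return DEFAULT_PEER_TABLE_SIZE;
    }
    return next_pow2(2 * num_procs);
}

/* Per-peer buffer size: split a fixed memory budget across the other
 * procs in the job, clamped to [min_bytes, max_bytes] (assumed policy). */
static size_t per_peer_buffer_bytes(size_t num_procs, size_t budget_bytes,
                                    size_t min_bytes, size_t max_bytes)
{
    size_t peers = (num_procs > 1) ? num_procs - 1 : 1;
    size_t share = budget_bytes / peers;

    if (share < min_bytes) {
        return min_bytes;
    }
    if (share > max_bytes) {
        return max_bytes;
    }
    return share;
}
```

Note that both helpers only need a reasonable initial estimate; if the number is merely "procs at launch" rather than "currently connected procs", the result is a sizing hint, not a correctness requirement. That distinction is exactly the first question under "NO, WE SHOULD NOT".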