As per the RFC below, I'll begin rolling these changes into the trunk over the next week.
> WHAT: Begin the process of introducing threads and thread safety into ORTE
>
> WHY: ORTE is becoming increasingly dependent on thread-safe operations
> (lock, cond_wait, unlock). However, OPAL thread support is defined to no-ops
> unless --enable-opal-multi-threads is set. We need an independent way
> of ensuring thread-safety in ORTE is active as doing so at the OPAL level
> negatively impacts the MPI layer.
>
> WHERE: Solely inside the ORTE code tree.
>
> WHEN: No real rush - somewhere in the 1.5 series
>
> TIMEOUT: Aug 13
>
> ----------------------------------------------------------------------------
> Steps to be completed for Stage 1:
>
> 1. copy the opal thread code into a new orte/threads directory, renaming and
> editing as required
>
> 2. create ORTE_THREAD_[UN]LOCK macros that are always defined and active.
> Since ORTE isn't a performance-critical code path, we will always lock/unlock
> as required to protect global data (should help resolve some of our lingering
> thread-related problems). We will do a global search/replace for the OPAL
> macros inside the ORTE code tree and replace them with the new ORTE
> equivalents.
>
> 3. repackage the orte_job_data, orte_node_pool, orte_nidmap, and orte_jobmap
> global arrays into a new wrapper class that includes ORTE thread-locking
> support plus a pointer array. This will allow ORTE to thread-safe these
> values independent of whether or not OPAL threads are enabled.
>
> 4. add thread-lock/release code around areas where the globals in #3 are used.
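
For anyone who wants a concrete picture of steps 2-4, here is a rough sketch of the idea in plain C on top of pthreads. It is only an illustration under my own assumptions, not the code that will land in the trunk: the macro bodies are guesses, and the sketch_locked_array_t type and the sketch_array_* functions are hypothetical names invented for this example. The real wrapper would be built on the copied OPAL thread code and the existing pointer array class.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Step 2 (illustrative): lock macros that are always defined and active,
 * i.e. they never compile down to no-ops. Assumed definitions, built
 * directly on pthreads for the sake of the example. */
#define ORTE_THREAD_LOCK(m)   pthread_mutex_lock(m)
#define ORTE_THREAD_UNLOCK(m) pthread_mutex_unlock(m)

/* Step 3 (illustrative): hypothetical wrapper pairing a lock with a
 * pointer array, standing in for globals such as orte_job_data. */
typedef struct {
    pthread_mutex_t lock;   /* protects items and size */
    void **items;           /* the wrapped pointer array */
    size_t size;            /* number of slots in items */
} sketch_locked_array_t;

/* Allocate the slots and initialize the lock; returns 0 on success. */
int sketch_array_init(sketch_locked_array_t *a, size_t size)
{
    assert(NULL != a && size > 0);
    a->items = calloc(size, sizeof(void *));
    if (NULL == a->items) {
        return -1;
    }
    a->size = size;
    if (0 != pthread_mutex_init(&a->lock, NULL)) {
        free(a->items);
        a->items = NULL;
        return -1;
    }
    return 0;
}

/* Step 4 (illustrative): every access to the global data is bracketed
 * by lock/unlock. Returns 0 on success, -1 if idx is out of range. */
int sketch_array_set(sketch_locked_array_t *a, size_t idx, void *item)
{
    int rc = -1;
    ORTE_THREAD_LOCK(&a->lock);
    if (idx < a->size) {
        a->items[idx] = item;
        rc = 0;
    }
    ORTE_THREAD_UNLOCK(&a->lock);
    return rc;
}

/* Returns the stored pointer, or NULL if the slot is empty or idx is
 * out of range. */
void *sketch_array_get(sketch_locked_array_t *a, size_t idx)
{
    void *item = NULL;
    ORTE_THREAD_LOCK(&a->lock);
    if (idx < a->size) {
        item = a->items[idx];
    }
    ORTE_THREAD_UNLOCK(&a->lock);
    return item;
}

/* Release the slots and destroy the lock. Does not free stored items. */
void sketch_array_fini(sketch_locked_array_t *a)
{
    pthread_mutex_destroy(&a->lock);
    free(a->items);
    a->items = NULL;
    a->size = 0;
}
```

The point of keeping the lock inside the wrapper is that callers cannot reach the array without going through the locked accessors, regardless of how OPAL was configured.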