WHAT: Begin the process of introducing threads and thread safety into ORTE

WHY: ORTE is becoming increasingly dependent on thread-safe operations
     (lock, cond_wait, unlock). However, OPAL thread support is defined to
     no-ops unless --enable-opal-multi-threads is set. We need an independent
     way of ensuring thread-safety in ORTE is active, as doing so at the OPAL
     level negatively impacts the MPI layer.

WHERE: Solely inside the ORTE code tree.

WHEN: No real rush - somewhere in the 1.5 series

TIMEOUT: Aug 13

----------------------------------------------------------------------------
Steps to be completed for Stage 1:

1. copy the opal thread code into a new orte/threads directory, renaming and 
editing as required

2. create ORTE_THREAD_[UN]LOCK macros that are always defined and active. Since 
ORTE isn't a performance-critical code path, we will always lock/unlock as 
required to protect global data (should help resolve some of our lingering 
thread-related problems). We will do a global search/replace for the OPAL 
macros inside the ORTE code tree and replace them with the new ORTE equivalents.

3. repackage the orte_job_data, orte_node_pool, orte_nidmap, and orte_jobmap 
global arrays into a new wrapper class that includes ORTE thread-locking 
support plus a pointer array. This will allow ORTE to make these values 
thread-safe regardless of whether OPAL threads are enabled.

4. add thread-lock/release code around areas where the globals in #3 are used.
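
To make steps 2-4 concrete, here is a rough sketch of what the always-active 
macros and the locked wrapper could look like. It is only an illustration 
under assumptions: it uses plain pthreads rather than the copied OPAL thread 
code, a bare void* array stands in for the real pointer array, and the names 
orte_thread_array_t, orte_thread_array_init/add/get/fini, and demo_worker are 
hypothetical. Only the ORTE_THREAD_[UN]LOCK macro names come from the proposal 
itself, and their argument form here is a guess.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Step 2 (sketch): lock macros that are always defined and active,
 * i.e. never compiled out by a configure option. Argument form is assumed. */
#define ORTE_THREAD_LOCK(m)   pthread_mutex_lock(m)
#define ORTE_THREAD_UNLOCK(m) pthread_mutex_unlock(m)

/* Step 3 (sketch): hypothetical wrapper holding a lock plus a pointer array. */
typedef struct {
    pthread_mutex_t lock;
    void **items;
    size_t size;      /* number of entries in use */
    size_t capacity;  /* number of entries allocated */
} orte_thread_array_t;

static int orte_thread_array_init(orte_thread_array_t *a, size_t capacity)
{
    assert(NULL != a && capacity > 0);
    a->items = calloc(capacity, sizeof(void *));
    if (NULL == a->items) {
        return -1;
    }
    a->size = 0;
    a->capacity = capacity;
    return pthread_mutex_init(&a->lock, NULL);
}

/* Step 4 (sketch): every access to the global data takes the lock.
 * Returns the index of the new entry, or -1 if the array could not grow. */
static int orte_thread_array_add(orte_thread_array_t *a, void *item)
{
    int idx = -1;
    ORTE_THREAD_LOCK(&a->lock);
    if (a->size == a->capacity) {
        size_t ncap = a->capacity * 2;
        void **tmp = realloc(a->items, ncap * sizeof(void *));
        if (NULL != tmp) {
            a->items = tmp;
            a->capacity = ncap;
        }
    }
    if (a->size < a->capacity) {
        idx = (int)a->size;
        a->items[a->size++] = item;
    }
    ORTE_THREAD_UNLOCK(&a->lock);
    return idx;
}

/* Returns the entry at idx, or NULL if idx is out of range. */
static void *orte_thread_array_get(orte_thread_array_t *a, size_t idx)
{
    void *item = NULL;
    ORTE_THREAD_LOCK(&a->lock);
    if (idx < a->size) {
        item = a->items[idx];
    }
    ORTE_THREAD_UNLOCK(&a->lock);
    return item;
}

static void orte_thread_array_fini(orte_thread_array_t *a)
{
    pthread_mutex_destroy(&a->lock);
    free(a->items);
    a->items = NULL;
    a->size = a->capacity = 0;
}

/* Demo worker: appends 1000 entries through the locked API. */
static void *demo_worker(void *arg)
{
    static int dummy;
    orte_thread_array_t *a = arg;
    for (int i = 0; i < 1000; i++) {
        orte_thread_array_add(a, &dummy);
    }
    return NULL;
}
```

Globals such as orte_job_data would then be instances of the wrapper, and 
callers would go through the add/get accessors instead of touching the array 
directly. Several threads running demo_worker against one instance should 
leave a consistent entry count whether or not OPAL threads are enabled.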


