https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80822
--- Comment #3 from Nathan Weeks <weeks at iastate dot edu> ---

Setting OMP_DISPLAY_ENV=verbose results in the following output with Intel 17.0.2:

================================================================================
OPENMP DISPLAY ENVIRONMENT BEGIN
  _OPENMP='201511'
  [host] KMP_ABORT_DELAY='0'
  [host] KMP_ADAPTIVE_LOCK_PROPS='1,1024'
  [host] KMP_ALIGN_ALLOC='64'
  [host] KMP_ALL_THREADPRIVATE='256'
  [host] KMP_ALL_THREADS='2147483647'
  [host] KMP_ATOMIC_MODE='2'
  [host] KMP_BLOCKTIME='200'
  [host] KMP_CPUINFO_FILE: value is not defined
  [host] KMP_DETERMINISTIC_REDUCTION='FALSE'
  [host] KMP_DISP_NUM_BUFFERS='7'
  [host] KMP_DUPLICATE_LIB_OK='FALSE'
  [host] KMP_FORCE_REDUCTION: value is not defined
  [host] KMP_FOREIGN_THREADS_THREADPRIVATE='TRUE'
  [host] KMP_FORKJOIN_BARRIER='2,2'
  [host] KMP_FORKJOIN_BARRIER_PATTERN='hyper,hyper'
  [host] KMP_FORKJOIN_FRAMES='TRUE'
  [host] KMP_FORKJOIN_FRAMES_MODE='3'
  [host] KMP_GTID_MODE='3'
  [host] KMP_HANDLE_SIGNALS='FALSE'
  [host] KMP_HOT_TEAMS_MAX_LEVEL='1'
  [host] KMP_HOT_TEAMS_MODE='0'
  [host] KMP_INIT_AT_FORK='TRUE'
  [host] KMP_INIT_WAIT='2048'
  [host] KMP_ITT_PREPARE_DELAY='0'
  [host] KMP_LIBRARY='throughput'
  [host] KMP_LOCK_KIND='queuing'
  [host] KMP_MALLOC_POOL_INCR='1M'
  [host] KMP_NEXT_WAIT='1024'
  [host] KMP_NUM_LOCKS_IN_BLOCK='1'
  [host] KMP_PLAIN_BARRIER='2,2'
  [host] KMP_PLAIN_BARRIER_PATTERN='hyper,hyper'
  [host] KMP_REDUCTION_BARRIER='1,1'
  [host] KMP_REDUCTION_BARRIER_PATTERN='hyper,hyper'
  [host] KMP_SCHEDULE='static,balanced;guided,iterative'
  [host] KMP_SETTINGS='FALSE'
  [host] KMP_SPIN_BACKOFF_PARAMS='4096,100'
  [host] KMP_STACKOFFSET='64'
  [host] KMP_STACKPAD='0'
  [host] KMP_STACKSIZE='4M'
  [host] KMP_STORAGE_MAP='FALSE'
  [host] KMP_TASKING='2'
  [host] KMP_TASK_STEALING_CONSTRAINT='1'
  [host] KMP_USER_LEVEL_MWAIT='FALSE'
  [host] KMP_VERSION='FALSE'
  [host] KMP_WARNINGS='TRUE'
  [host] OMP_CANCELLATION='FALSE'
  [host] OMP_DEFAULT_DEVICE='0'
  [host] OMP_DISPLAY_ENV='VERBOSE'
  [host] OMP_DYNAMIC='FALSE'
  [host] OMP_MAX_ACTIVE_LEVELS='2147483647'
  [host] OMP_MAX_TASK_PRIORITY='0'
  [host] OMP_NESTED='FALSE'
  [host] OMP_NUM_THREADS='32'
  [host] OMP_PLACES='threads'
  [host] OMP_PROC_BIND='spread'
  [host] OMP_SCHEDULE='static'
  [host] OMP_STACKSIZE='4M'
  [host] OMP_THREAD_LIMIT='2147483647'
  [host] OMP_WAIT_POLICY='PASSIVE'
  [host] KMP_AFFINITY='noverbose,warnings,respect,granularity=thread,noduplicates,compact,0,0'
OPENMP DISPLAY ENVIRONMENT END
================================================================================

For comparison, the Cray 8.5.4 OpenMP runtime (which produces the same thread affinity as the Intel 17.0.2 OpenMP runtime in the aforementioned example) outputs the following when OMP_DISPLAY_ENV=verbose:

================================================================================
OPENMP DISPLAY ENVIRONMENT BEGIN
  _OPENMP='201307'
  OMP_SCHEDULE='static,0'
  OMP_NUM_THREADS='32'
  OMP_DYNAMIC='TRUE'
  OMP_NESTED='FALSE'
  OMP_STACKSIZE='128MB'
  OMP_WAIT_POLICY='ACTIVE'
  OMP_MAX_ACTIVE_LEVELS='1023'
  OMP_THREAD_LIMIT='256'
  CRAY_OMP_CHECK_AFFINITY='FALSE'
  OMP_PROC_BIND='spread'
  OMP_PLACES='threads'
  OMP_CANCELLATION='FALSE'
  OMP_DISPLAY_ENV='VERBOSE'
  OMP_DEFAULT_DEVICE='0'
  CRAY_OMP_GUARD_SIZE='0B'
  CRAY_OMP_TASK_Q_LIMIT='256'
  CRAY_OMP_CONTENTION_POLICY='Automatic'
OPENMP DISPLAY ENVIRONMENT END
================================================================================

Also, in this environment, with OMP_NUM_THREADS=2 OMP_PLACES=threads OMP_PROC_BIND=close, the libgomp affinity results in both threads being pinned to different sockets:

================================================================================
$ OMP_NUM_THREADS=2 OMP_PLACES=threads OMP_PROC_BIND=close ./xthi-omp.gnu | sort -k 4n,4n
Hello from thread 0, on nid00015. (core affinity = 0)
Hello from thread 1, on nid00015. (core affinity = 1)
================================================================================

Both the Intel and Cray OpenMP runtimes pin the threads to the same physical core:

================================================================================
$ OMP_NUM_THREADS=2 OMP_PLACES=threads OMP_PROC_BIND=close ./xthi-omp.intel | sort -k 4n,4n
Hello from thread 0, on nid00015. (core affinity = 0)
Hello from thread 1, on nid00015. (core affinity = 32)
================================================================================

It does seem that the OpenMP 4.5 specification can be interpreted to support the libgomp behavior (e.g., p. 52 lines 33-38), though it at least seems counterintuitive.
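
To make the difference between the two results concrete, here is a small Python sketch of one way it could arise. It assumes (this is not confirmed by the output above) that the runtimes differ only in how they order the place list built for OMP_PLACES=threads: by OS CPU number in one case, by physical core in the other. The topology (32 cores, 2 hardware threads per core, CPUs c and c + 32 sharing a core) is hypothetical, chosen only to match the 0/32 pairing reported above, and bind_close is a simplified model of the "close" policy for the case where the team is no larger than the place list.

```python
# Hypothetical topology, assumed for illustration only: 32 physical cores with
# 2 hardware threads each, where OS CPU c and OS CPU c + 32 share a core.
N_CORES = 32
THREADS_PER_CORE = 2


def places_by_os_number():
    """One place per hardware thread, ordered by OS CPU number."""
    return [[cpu] for cpu in range(N_CORES * THREADS_PER_CORE)]


def places_by_core():
    """One place per hardware thread, with the siblings of a core adjacent."""
    return [[core + t * N_CORES]
            for core in range(N_CORES)
            for t in range(THREADS_PER_CORE)]


def bind_close(places, n_threads, master_place=0):
    """Simplified 'close' policy for n_threads <= len(places): thread i takes
    the i-th place counting from the master's, wrapping around the list."""
    if n_threads > len(places):
        raise ValueError("sketch only covers n_threads <= number of places")
    return [places[(master_place + i) % len(places)] for i in range(n_threads)]


if __name__ == "__main__":
    print(bind_close(places_by_os_number(), 2))  # [[0], [1]]
    print(bind_close(places_by_core(), 2))       # [[0], [32]]
```

Under these assumptions the same "close" rule yields CPUs 0 and 1 with the first ordering and CPUs 0 and 32 with the second, which is the pattern seen in the two xthi runs.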