Andrew Somorjai, on Tue 20 Nov 2012 09:45:12 +0100, wrote:
> I'm also confused about these two lines and whether its necessary for the
> second one to exist?
>
> HANDLE thread[num_threads];
> HANDLE pthread_getw32threadhandle_np(thread);
>
> Does the second api call fill the thread array or just the first element?

It does not fill anything; it returns the converted value. The second API
call has to go between pthread_create and hwloc_set_thread_cpubind, and it
needs to be made once for each thread. Otherwise it's not surprising that
the threads keep moving around: if you checked the error returned by
hwloc_set_thread_cpubind, you would see that it says the thread id is
invalid.

What you need to understand is that pthread_create fills a pthread_t, not
a HANDLE. That's why one then needs to use pthread_getw32threadhandle_np to
convert the pthread_t into a HANDLE before passing it to
hwloc_set_thread_cpubind. I.e.

> pthread_t thread[num_threads];
>
> for (t = 0; t < num_threads; t++)
> {
>     printf("Creating thread %ld\n", t);
>     rc = pthread_create(&thread[t], NULL, threaded_task, (void *)t);
>     HANDLE handle = pthread_getw32threadhandle_np(thread[t]);
>
>     hwloc_bitmap_t bitmap = hwloc_bitmap_alloc();
>     hwloc_bitmap_set_only(bitmap, t);
>     hwloc_set_thread_cpubind(topology, handle, bitmap, 0);
>     hwloc_bitmap_free(bitmap);

In addition to that, remember what I mentioned in a previous mail (Mon, 19
Nov 2012 23:36:09 +0100): using hwloc_bitmap_set_only will use physical
indexes, which are most probably not what you want, because physical
numbering is essentially arbitrary (it depends on the machine, the BIOS and
the OS). Depending on whether you want to run one thread per core or one
per hyperthread, use the first or the second of these:

    rc = pthread_create(&thread[t], NULL, threaded_task, (void *)t);
    HANDLE handle = pthread_getw32threadhandle_np(thread[t]);
    hwloc_set_thread_cpubind(topology, handle,
            hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, t)->cpuset, 0);

    rc = pthread_create(&thread[t], NULL, threaded_task, (void *)t);
    HANDLE handle = pthread_getw32threadhandle_np(thread[t]);
    hwloc_set_thread_cpubind(topology, handle,
            hwloc_get_obj_by_type(topology, HWLOC_OBJ_PU, t)->cpuset, 0);

and use

    n = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_CORE);

or

    n = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_PU);

to get the number of cores or hyperthreads, respectively.

Samuel