Hi Brice,
I made the corrections in the program below. What I notice now is that the cores used by the program switch around. Two cores run at 100%, but after a few seconds another core takes over at 100%. There are always two cores at maximum, while the other six (I have eight) run normally at around 10-15% doing background work. By contrast, the Windows API calls used my first two cores for the entire duration of the program.

I used Process Explorer on Windows XP to show the system information for each CPU. The graphs were definitely different from those of the Windows API program.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <hwloc.h>

hwloc_topology_t topology;

/* A task that takes some time to complete. The id identifies distinct
   tasks for printed messages. */
void *task(int id) {
    printf("Task %d started\n", id);
    int i;
    double result = 0.0;
    for (i = 0; i < 1000000000; i++) {
        result = result + sin(i) * tan(i);
    }
    printf("Task %d completed with result %e\n", id, result);
}

/* Same as 'task', but meant to be called from different threads. */
void *threaded_task(void *t) {
    long id = (long) t;
    printf("Thread %ld started\n", id);
    task(id);
    printf("Thread %ld done\n", id);
    pthread_exit(0);
}

/* Run 'task' num_tasks times, each in its own thread. */
void *parallel(int num_tasks)
{
    int num_threads = num_tasks;
    //pthread_t thread[num_threads];
    HANDLE thread[num_threads];
    HANDLE pthread_getw32threadhandle_np(thread);
    int rc;
    long t;
    for (t = 0; t < num_threads; t++) {
        printf("Creating thread %ld\n", t);
        rc = pthread_create(&thread[t], NULL, threaded_task, (void *)t);
        hwloc_bitmap_t bitmap = hwloc_bitmap_alloc();
        hwloc_bitmap_only(bitmap, t);
        hwloc_set_thread_cpubind(topology, thread[t], bitmap, 0);
        if (rc) {
            printf("ERROR: return code from pthread_create() is %d\n", rc);
            exit(-1);
        }
        hwloc_bitmap_free(bitmap);
    }
}

int main() {
    hwloc_topology_init(&topology);
    parallel(2);
    pthread_exit(NULL);
}

Andrew