> The TSC between the two cpu's are not synchronized with each other,
> so your results are going to be really messed up.

The TSC was so skewed between the processors that it was very easy to
filter out the occasional bogus result automatically: it was either
negative or about ten orders of magnitude larger than every other
result. Perhaps there were problems elsewhere, though.
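The filter amounted to a plausibility check on each measured delta; a
minimal sketch (not the exact code from my program; rdtsc() and the
threshold are illustrative), assuming x86 and GCC-style inline asm:

#include <stdint.h>

/* Read the time stamp counter. */
static inline uint64_t
rdtsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
}

/*
 * Keep a measured delta only if it is positive and not absurdly
 * large; a negative or huge delta means the start and stop reads
 * came from CPUs whose counters disagree.
 */
static int
plausible(int64_t delta, int64_t threshold)
{
    return delta > 0 && delta < threshold;
}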
> what you are seeing are probably scheduler artifacts determined by
> which cpu(s) the threads are scheduled on.

Okay. Then there's a good reason libc_r isn't affected by that
problem ;)

> Thread creation times are probably not a very useful statistic. libc_r
> doesn't create real threads, libthread_xu does.

Right, and that's exactly what I wanted to show in my report: that
userland threads are always faster in that respect.

> So libthread_xu will always take a bit longer to create the thread.
> But once created, the kernel does a much better job managing

But then what could have created the skew with fork()? libthread_xu was
consistently faster every time I ran the tests.

As a side note, my program gets a bus error and dumps core in about 50%
of the runs when running with libthread_xu:

// CLEAR THE ARRAY
for (i = 0; i < 1000; i++)
    tt[i] = 0;

// ADD +1 FOR EVERY OCCURRENCE OF A CERTAIN RESULT
for (i = 0; i < RUNS; i++)
    tt[ti1[i]] = tt[ti1[i]] + 1;

// WRITE THE RESULT TO A FILE
for (i = 0; i < 1000; i++)
    fprintf(pth_cf, "%d %d\n", i, tt[i]);

// CLEAR THE ARRAY AGAIN
for (i = 0; i < 1000; i++)
    tt[i] = 0;

// LIBTHREAD_XU DIES 50% OF THE TIME ON THE FOLLOWING LINE
for (i = 0; i < RUNS; i++)
    tt[ti2[i]] = tt[ti2[i]] + 1;

for (i = 0; i < 1000; i++)
    fprintf(pth_jf, "%d %d\n", i, tt[i]);

~ Robert
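P.S. Nothing in the fragment above keeps the values in ti1[] or ti2[]
below 1000, so a single skewed timing result makes tt[ti2[i]] write far
outside the array; that could well be the cause of the intermittent bus
error. A minimal bounds-checked sketch (tt, ti2, RUNS, and i as above;
the out_of_range counter is an illustrative addition):

int out_of_range = 0;

for (i = 0; i < RUNS; i++) {
    if (ti2[i] >= 0 && ti2[i] < 1000)
        tt[ti2[i]] = tt[ti2[i]] + 1;
    else
        out_of_range++;    /* drop a bogus TSC delta */
}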
