> Hmm thanks, that's interesting -- I was thinking it was probably caused
> by OS X, but it appears to happen on Linux too. Could you try running
> the old code too, and see if you experience the order of magnitude
> slowdown too?

The original program on my Linux 2.6.26 Core2 Duo:

[EMAIL PROTECTED] Test]$ time ./tr-threaded 1000000
37

real    0m0.635s
user    0m0.530s
sys     0m0.077s

[EMAIL PROTECTED] Test]$ time ./tr-nothreaded 1000000
37

real    0m0.352s
user    0m0.350s
sys     0m0.000s

[EMAIL PROTECTED] Test]$ time ./tr-threaded 1000000 +RTS -N2
37

real    0m13.954s
user    0m4.333s
sys     0m5.736s

--------------------------

Seeing as there still was obviously not enough computation to justify
the OS threads in my last example, I made a test where it hashed a
32 byte string (show . md5 . encode $ val):

[EMAIL PROTECTED] Test]$ time ./threadring-nothreaded 1000000
50
552

real    0m1.408s
user    0m1.323s
sys     0m0.083s

[EMAIL PROTECTED] Test]$ time ./threadring-threaded 1000000
50
552

real    0m1.948s
user    0m1.807s
sys     0m0.143s

[EMAIL PROTECTED] Test]$ time ./threadring-threaded 1000000 +RTS -N2
552
50

real    0m1.663s
user    0m1.427s
sys     0m0.237s
[EMAIL PROTECTED] Test]$

---------------------------

Seeing as this still doesn't beat the old RTS, I decided to increase
the per-unit work a little more. This code will hash 10KB every time
the token is passed / decremented.

[EMAIL PROTECTED] Test]$ time ./threadring-nothreaded 100000
(308,77851ef5e9e781c04850a7df9cc855d2)

real    2m56.453s
user    2m55.399s
sys     0m0.457s

[EMAIL PROTECTED] Test]$ time ./threadring-threaded 100000
(308,77851ef5e9e781c04850a7df9cc855d2)

real    3m6.430s
user    3m5.868s
sys     0m0.460s

[EMAIL PROTECTED] Test]$ time ./threadring-threaded 100000 +RTS -N2
(810,77851ef5e9e781c04850a7df9cc855d2)
(308,77851ef5e9e781c04850a7df9cc855d2)

real    1m55.616s
user    2m47.982s
sys     0m3.586s

* Yes, I notice it's exiting before the output gets printed a couple
  of times, oh well.

-------------------------

REFLECTION

Yay, the multicore version pays off when the workload is non-trivial.
CPU utilization is still rather low for the -N2 case (70%). I think
the Haskell threads have an affinity for certain OS threads (and thus
a CPU).
Perhaps it results in a CPU having both tokens of work and the other
having none?

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
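
P.S. One way to probe the affinity guess would be to pin each ring member to a capability explicitly and see whether utilization changes. Below is a minimal sketch of that idea, not the benchmark code itself: the names (work, member, runRing), the ring size of 503, and the arithmetic fold standing in for MD5 are all my assumptions, picked so that it builds with base alone. forkOn and getNumCapabilities are the Control.Concurrent calls as I understand them in current GHC.

```haskell
import Control.Concurrent (forkOn, getNumCapabilities)
import Control.Concurrent.MVar
import Control.Exception (evaluate)
import Control.Monad (forM_, replicateM)
import Data.Bits (xor, (.&.))
import Data.List (foldl')

-- Stand-in for the per-hop work (the real test used MD5): an
-- FNV-style fold over n synthetic bytes, chosen only to burn CPU.
work :: Int -> Int -> Int
work n seed = foldl' step seed [1 .. n]
  where step h b = (h `xor` (b .&. 255)) * 16777619

-- One ring member: take a token, do the work, then either pass the
-- token on with one hop fewer or report it as finished.
member :: Int -> Int -> MVar (Int, Int) -> MVar (Int, Int)
       -> MVar (Int, Int) -> IO ()
member ident workSize inbox outbox done = loop
  where
    loop = do
      (hops, acc) <- takeMVar inbox
      acc' <- evaluate (work workSize acc)  -- force the work on this thread
      if hops <= 0
        then putMVar done (ident, acc')
        else putMVar outbox (hops - 1, acc')
      loop  -- keep serving; the other token may still be circulating

-- Build a ring of n members (n >= 2), pin member i to capability
-- i `mod` caps, inject two tokens on opposite sides of the ring, and
-- wait for both to finish.
runRing :: Int -> Int -> Int -> IO [(Int, Int)]
runRing n hops workSize = do
  caps  <- getNumCapabilities
  boxes <- replicateM n newEmptyMVar
  done  <- newEmptyMVar
  let nexts = drop 1 boxes ++ take 1 boxes
  forM_ (zip3 [0 ..] boxes nexts) $ \(i, inbox, outbox) ->
    forkOn (i `mod` caps) (member i workSize inbox outbox done)
  putMVar (boxes !! 0) (hops, 1)
  putMVar (boxes !! (n `div` 2)) (hops, 2)
  a <- takeMVar done
  b <- takeMVar done
  return [a, b]

main :: IO ()
main = runRing 503 10000 1000 >>= mapM_ print
```

Build with ghc -O2 -threaded and run with +RTS -N2 to compare against the unpinned numbers. Pinning by i `mod` caps makes neighbouring members alternate between capabilities, so every hop crosses CPUs; swapping in a placement that keeps each half of the ring on one capability would show the opposite extreme.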