> On Sun, 11 Sep 2011 10:22:30 -0400
> chm <[email protected]> wrote:
>
> Has anyone seen performance benefit from the
> new auto pthread capability?
>
> When I run the t/pthread_auto.t test on an
> AMD Athlon(tm) X2 Dual Core machine I see no
> win from pthreads. It would seem that the
> performance gain might depend on the complexity
> of the calculation being threaded and on the
> number of cores.
>
> Data points anyone?
>
> --Chris

Hi.

First off, the test was broken, but it seems you already fixed it (the unthreaded control case was actually set to 10-way threaded).

I just ran some experiments to see how beneficial extra threads are, and it is clear that the benchmarking reported by the test is misleading. It reports the wall-clock timing with a resolution of 1 second (way too coarse to be useful) and a user timing with a resolution of 0.01 seconds. The user timing counts CPU time summed over all cores, so it's USELESS here. If 5 cores each spend 1 second doing something, the user timing would be 5 seconds even though only 1 second of wall-clock time has passed. The whole point of the automatic threading is to reduce wall-clock time, even at the cost of more total user time.

I increased the resolution of the wall-clock timing by replacing the 'use Benchmark' in the test header with

    use Benchmark ':hireswallclock';
    use Time::HiRes;

If it's acceptable to require that Time::HiRes is available, I think we should make this change permanent.

This gives us useful wall-clock numbers, so I ran some tests to see how adding threads affects the computation time. I did this with the stock computation in the test ( $a += 1 ) and with a more complicated computation, to try to reduce the relative overhead costs ( $a = random(2000000); $a **= 1.3 ). The timings were done on a recent 8-core Intel machine running a recent Debian/unstable install.

Wall-clock timings, in seconds:

| set_autopthread_targ | += 1 (500 times) | **= 1.3 (10 times) |
|----------------------+------------------+--------------------|
|                    0 |             1.90 |               2.15 |
|                    1 |             1.90 |               2.15 |
|                    2 |             1.17 |               1.10 |
|                    3 |             1.15 |               1.10 |
|                    4 |             0.91 |               0.56 |
|                    5 |             0.89 |               0.45 |
|                    6 |             0.90 |               0.45 |
|                    7 |             0.90 |               0.46 |
|                    8 |             0.80 |               0.29 |
|                    9 |             0.80 |               0.29 |
|                   10 |             0.93 |               0.39 |

We can clearly see that extra threads make things go quicker. We can also see that the heavier computation benefits more from extra threads (lower relative overhead costs to maintain the threads).
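
To put rough numbers on that, here is a small Perl sketch that turns the wall-clock timings from the table above into speedups relative to the unthreaded case. It is not part of t/pthread_auto.t; the %timing layout and the speedup() helper are made up for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Wall-clock timings in seconds, copied from the table in this message.
# Key:   value passed to set_autopthread_targ
# Value: [ time for "+= 1" (500 times), time for "**= 1.3" (10 times) ]
my %timing = (
    0  => [ 1.90, 2.15 ],
    1  => [ 1.90, 2.15 ],
    2  => [ 1.17, 1.10 ],
    3  => [ 1.15, 1.10 ],
    4  => [ 0.91, 0.56 ],
    5  => [ 0.89, 0.45 ],
    6  => [ 0.90, 0.45 ],
    7  => [ 0.90, 0.46 ],
    8  => [ 0.80, 0.29 ],
    9  => [ 0.80, 0.29 ],
    10 => [ 0.93, 0.39 ],
);

# Hypothetical helper: how many times faster the threaded run is
# than the baseline run.
sub speedup {
    my ( $baseline, $threaded ) = @_;
    return $baseline / $threaded;
}

# Use the unthreaded case (target 0) as the baseline for both columns.
my ( $base_add, $base_pow ) = @{ $timing{0} };

printf "%-8s %-10s %-10s\n", 'target', '+= 1', '**= 1.3';
for my $n ( sort { $a <=> $b } keys %timing ) {
    my ( $t_add, $t_pow ) = @{ $timing{$n} };
    printf "%-8d %-10.2f %-10.2f\n",
        $n, speedup( $base_add, $t_add ), speedup( $base_pow, $t_pow );
}
```

With a target of 8 this works out to roughly 2.4x for the += 1 case (1.90 / 0.80) and roughly 7.4x for the **= 1.3 case (2.15 / 0.29), which is close to the ideal for an 8-core machine.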

There's an interesting discrete nature to the improvement: adding a 4th thread makes a huge difference, while adding a 3rd makes almost none. This may be due to the way the auto-threading is implemented. We can also see that when we have more threads than cores, the extra threads are a burden, not an improvement: 9 threads are no faster than 8, and 10 threads are slower.

dima

_______________________________________________
Perldl mailing list
[email protected]
http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
