> On Sun, 11 Sep 2011 10:22:30 -0400
> chm <[email protected]> wrote:
>
> Has anyone seen performance benefit from the
> new auto pthread capability?
> 
> When I run the t/pthread_auto.t test on an
> AMD Athlon(tm) X2 Dual Core machine I see no
> win from pthreads.  It would seem that the
> performance gain might depend on the complexity
> of the calculation being threaded and on the
> number of cores.
> 
> Data points anyone?
> 
> --Chris

Hi.

First off, the test was broken, but it seems you already fixed it (the
unthreaded control case was actually set to 10-way threaded). I just ran some
experiments to see how beneficial extra threads are, and it is clear that the
benchmarking reported by the test is misleading. It reports the wall-clock
timing with a resolution of 1 second (way too coarse to be useful) and a user
timing with a resolution of 0.01 seconds. The user timing counts CPU time
summed over all threads, so it's USELESS here. If 5 cores each spend 1 second
doing something, the user timing would be 5 seconds even though only 1 second
of wall-clock time went by. The whole point of the automatic threading is to
reduce wall-clock time, even if that costs more total CPU time.

I increased the resolution of the wall-clock timing by replacing the 'use
Benchmark' in the test header with

use Benchmark ':hireswallclock';
use Time::HiRes;

If it's acceptable to require that Time::HiRes is available, I think we should
make this change permanent.
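To make the wall-clock vs. CPU-time distinction concrete, here is a minimal sketch that reads both numbers from a Benchmark object. It uses only core modules; the helper name time_workload and the square-root loop are stand-ins I made up for illustration (the real test times a PDL operation), and cpu_p is user plus system time rather than user time alone.

```perl
use strict;
use warnings;
use Benchmark qw(:hireswallclock timeit);

# Hypothetical helper: run $code once and return both timings in seconds.
sub time_workload {
    my ($code) = @_;
    my $t = timeit(1, $code);
    return {
        wall => $t->real,    # elapsed time; fractional thanks to :hireswallclock
        cpu  => $t->cpu_p,   # user + system CPU time of this process
    };
}

# Stand-in workload, not the computation from the test.
my $r = time_workload(sub { my $x = 0; $x += sqrt($_) for 1 .. 200_000; });
printf "wall-clock: %.4f s, cpu: %.4f s\n", $r->{wall}, $r->{cpu};
```

For a single-threaded workload like this one the two numbers should be close. With auto-threading the CPU figure can exceed the wall-clock figure, which is why only the wall-clock number is worth comparing.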

This gives us useful wall-clock numbers, so I ran some tests to see how adding
threads affects the computation time. I did this with the stock computation in
the test ( $a += 1 ) and with a more complicated computation, to try to reduce
the relative overhead costs ( $a = random(2000000); $a **= 1.3 ). The timings
were done on a recent 8-core Intel machine running a recent Debian/unstable
install. Wall-clock timings, in seconds:


| set_autopthread_targ | += 1 (500 times) | **= 1.3 (10 times) |
|----------------------+------------------+--------------------|
|                    0 |             1.90 |               2.15 |
|                    1 |             1.90 |               2.15 |
|                    2 |             1.17 |               1.10 |
|                    3 |             1.15 |               1.10 |
|                    4 |             0.91 |               0.56 |
|                    5 |             0.89 |               0.45 |
|                    6 |             0.90 |               0.45 |
|                    7 |             0.90 |               0.46 |
|                    8 |             0.80 |               0.29 |
|                    9 |             0.80 |               0.29 |
|                   10 |             0.93 |               0.39 |
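
A run like the ones in the table could be scripted roughly as follows. This is a sketch and not the actual test file: the helper name wall_for_target is made up, the set_autopthread_size(0) call is my assumption about how to lift the minimum-size threshold for threading, and PDL is loaded at run time only so that the script exits cleanly on a machine without it.

```perl
use strict;
use warnings;
use Benchmark qw(:hireswallclock timeit);

# Load PDL at run time; bail out quietly if it is not installed.
eval { require PDL; PDL->import; 1 }
    or do { print "PDL not available, skipping\n"; exit 0 };

# Hypothetical helper: wall-clock seconds for $reps runs of the heavier
# computation with the auto-pthread target set to $target.
sub wall_for_target {
    my ($target, $reps) = @_;
    set_autopthread_targ($target);   # 0 turns auto-threading off
    set_autopthread_size(0);         # assumption: no minimum size for threading
    my $t = timeit($reps, sub {
        my $x = random(2_000_000);
        $x **= 1.3;
    });
    return $t->real;
}

# A few targets only, to keep the run short.
printf "%2d threads: %.2f s\n", $_, wall_for_target($_, 10) for 0, 2, 4;
```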

We can clearly see that extra threads make things go quicker, and that the
heavier computation benefits more from them (lower relative overhead costs to
maintain the threads). There's an interesting discrete nature to the
improvement: adding a 4th thread makes a huge difference, while adding a 3rd
makes almost none. This may be due to the way the auto-threading is
implemented. We can also see that when we have more threads than cores, the
extra threads are a burden, not an improvement: 9 threads do no better than 8,
and 10 are slower.
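
To put numbers on "benefits more", the table can be turned into speedups relative to the unthreaded case. A small sketch follows; the timings are copied from the table, and the hash layout and the speedup helper are just for illustration.

```perl
use strict;
use warnings;

# Wall-clock seconds from the table: target => [ "+= 1", "**= 1.3" ].
my %wall = (
    0 => [1.90, 2.15],  1 => [1.90, 2.15],  2  => [1.17, 1.10],
    3 => [1.15, 1.10],  4 => [0.91, 0.56],  5  => [0.89, 0.45],
    6 => [0.90, 0.45],  7 => [0.90, 0.46],  8  => [0.80, 0.29],
    9 => [0.80, 0.29],  10 => [0.93, 0.39],
);

# Speedup of column $col at thread target $n, relative to target 0.
sub speedup {
    my ($n, $col) = @_;
    return $wall{0}[$col] / $wall{$n}[$col];
}

for my $n (sort { $a <=> $b } keys %wall) {
    printf "%2d  += 1: %5.2fx   **= 1.3: %5.2fx\n",
        $n, speedup($n, 0), speedup($n, 1);
}
```

At 8 threads this comes out to roughly 2.4x for the += 1 case against roughly 7.4x for the **= 1.3 case, which is close to ideal scaling on 8 cores.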

dima
