On Wed, Jan 13, 2010 at 5:52 PM, William Stein <wst...@gmail.com> wrote:
> What matters for this benchmark is the number of cores that the computer has.
> Though t2 can manage 128 hardware threads, it only has 16 actual *cores*.

Not quite; the following is on a box with 8 cores and 16 hardware threads:

sage: time b = bernoulli(10^5, algorithm='bernmm', num_threads=1)
CPU times: user 4.84 s, sys: 0.00 s, total: 4.84 s
Wall time: 4.84 s
sage: timeit("bernoulli(10^5, algorithm='bernmm', num_threads=8)")
5 loops, best of 3: 1.06 s per loop
sage: timeit("bernoulli(10^5, algorithm='bernmm', num_threads=16)")
5 loops, best of 3: 914 ms per loop

That's roughly a 15% speedup from the extra threads. Not bad, given
that going from 4 to 8 threads on the same machine, or from 8 to 16
threads on sage.math, gives about a 30% speedup for the same computation.
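
For reference, the arithmetic behind those percentages can be sketched
in a few lines of Python. The timings are copied from the session above;
the helper name speedup() is just for illustration. Note that the
1-thread figure comes from a single "time" run while the other two are
timeit best-of-3 values, so the ratios are only approximate.

```python
# Wall-clock timings from the session above, in seconds.
t1 = 4.84     # num_threads=1  (single `time` run)
t8 = 1.06     # num_threads=8  (timeit, best of 3)
t16 = 0.914   # num_threads=16 (timeit, best of 3)


def speedup(t_before, t_after):
    """Return how many times faster the second timing is than the first."""
    return t_before / t_after


s8 = speedup(t1, t8)            # speedup of 8 threads over 1 thread
s16 = speedup(t1, t16)          # speedup of 16 threads over 1 thread
gain = speedup(t8, t16) - 1.0   # extra throughput from going 8 -> 16 threads
eff8 = s8 / 8                   # parallel efficiency at 8 threads

print(f"1 -> 8 threads:  {s8:.2f}x (efficiency {eff8:.0%})")
print(f"1 -> 16 threads: {s16:.2f}x")
print(f"8 -> 16 threads: {gain:.0%} more throughput")
```

Measured as throughput, the 8 to 16 step comes out near 16%; measured
as reduction in wall time (1 - t16/t8) it is closer to 14%, so "about
15%" sits between the two conventions.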

I'd guess that for a really CPU-bound task the number of cores should
be the limit, but for memory-bound tasks extra threads may be an
advantage because they can hide memory latency, as long as the threads
don't have to compete for cache. In the end, whatever makes the best use
of cache size and memory bandwidth across threads is probably best.
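
Since the best thread count depends on the workload, the simplest check
is to measure it. Below is a minimal sketch of such a sweep in plain
Python; best_time(), sweep() and report() are hypothetical helper names,
not part of Sage, and "run" stands for any callable that takes a thread
count.

```python
import time


def best_time(run, num_threads, repeats=3):
    """Best wall-clock time (seconds) over `repeats` calls to run(num_threads)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run(num_threads)
        best = min(best, time.perf_counter() - start)
    return best


def sweep(run, thread_counts, repeats=3):
    """Map each thread count to its best measured wall-clock time."""
    return {n: best_time(run, n, repeats) for n in thread_counts}


def report(times):
    """Print each timing and its speedup relative to the smallest thread count."""
    base = times[min(times)]
    for n in sorted(times):
        print(f"{n:3d} threads: {times[n]:.3f} s, speedup {base / times[n]:.2f}x")


# In a Sage session the workload from this thread would be passed as:
#   report(sweep(lambda n: bernoulli(10^5, algorithm='bernmm', num_threads=n),
#                [1, 2, 4, 8, 16]))
```

The point where the speedup column stops growing is where extra threads
no longer pay off for that particular computation on that machine.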

Also, hyper-threading (HT) can share a core's arithmetic units between
the threads running on it, so unless the inner loops are already
perfectly scheduled, there's always something to gain from this approach.

Best, Gonzalo
