On Wed, 09 Mar 2005 13:52:30 +0100, Matthias Waldhauer
<[EMAIL PROTECTED]> wrote:

> So you are providing 2 threads to each physical CPU. Since both threads work
> on their working set, fill the cache and cause data to be thrown out of the
> cache, they will negatively affect each other more than e.g. Prime95 and the
> idle task. Also the usage of the integer units could have some effect.

To get to the bottom of this, I have done some benchmarking with both
Prime95 and my own parallel software (pbzip2).  From my testing, I
have concluded that the big "overhead" that I see is caused by the OS
itself and how it handles scheduling (as Brian B. suggested).

If I run two instances of Prime95 using an exponent in the M34XXXXXX
range, I get 0.073 sec/iteration when the machine is otherwise idle
with hyperthreading disabled.  If I enable hyperthreading, then the
speed varies widely depending on where WinXP decides to run the
process.  If I specifically set the affinity of each Prime95 instance
to a physical CPU, then the speed goes back to a constant 0.073
sec/iteration.

With my own pbzip2 data compression software, you can really see how
well (or not) an OS does with scheduling.  I tested the software using
between 1 and 4 threads on WinXP Pro, Linux 2.4.27, and Linux 2.6.9. 
I have attached the results as a graph which clearly shows that all 3
of the operating systems run pbzip2 with the same speed using 2 CPUs
when HT is disabled.  When you enable HT, both XP and Linux 2.4 do a
horrible job with 2 threads, while Linux 2.6 achieves the same
performance using 2 threads as they all do with HT disabled.
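
For anyone who wants to repeat this kind of comparison, here is a rough
sketch of a timing harness in Python. It is not the script used for the
graph. The function names are made up for illustration, and it assumes a
pbzip2 binary on the PATH that accepts -p<N> to set the number of
processors to use. Speedup is taken as the 1-thread time divided by the
N-thread time.

```python
import subprocess
import time


def pbzip2_command(path, threads):
    """Build a pbzip2 command line (assumes the -p<N> processor-count flag)."""
    return ["pbzip2", f"-p{threads}", path]


def time_run(command):
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


def speedups(seconds_by_threads):
    """Map thread count -> speedup relative to the 1-thread run."""
    baseline = seconds_by_threads[1]
    return {n: baseline / t for n, t in sorted(seconds_by_threads.items())}
```

Note that pbzip2 replaces its input file by default, so each timed run
needs a fresh copy of the test file. Running the same loop once with HT
enabled and once with it disabled gives the two curves to compare.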

I realize that HT is just a virtual processor so using 3 and 4 threads
was just for fun to see what would happen.  Using 3 threads got a
slight performance increase with all OSes getting about the same
results.  Using 4 threads is slower than 3 threads but faster than 2
threads.

So the moral of the story is that Windows XP does not handle
hyperthreading very well.  If you are running Prime95, you should
definitely set the affinity to the physical processor (which is CPU 0
if you have 1 processor [CPU 1 is the virtual], and CPU 0 and CPU 1 if
you have 2 processors [CPU 2 and 3 are the virtuals]).
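
As a small illustration of the numbering above, the following Python
sketch builds the affinity bitmask for the physical CPUs. It assumes the
enumeration described in this post (physical CPUs first, virtual ones
after), which may not hold on every system. The helper names are
hypothetical. The pinning call uses os.sched_setaffinity, which Python
only provides on Linux; on Windows the same mask is what the Win32
SetProcessAffinityMask call expects, or you can simply tick the CPUs in
Task Manager.

```python
import os


def physical_cpu_ids(n_physical):
    """Logical CPU indices treated as physical under the numbering above.

    Assumption: with n physical processors, CPUs 0..n-1 are physical and
    CPUs n..2n-1 are the hyperthreaded (virtual) ones.
    """
    if n_physical < 1:
        raise ValueError("need at least one physical processor")
    return list(range(n_physical))


def affinity_mask(cpu_ids):
    """Bitmask with one bit set per CPU index (bit 0 = CPU 0)."""
    mask = 0
    for cpu in cpu_ids:
        mask |= 1 << cpu
    return mask


def pin_current_process(cpu_ids):
    """Pin the calling process to cpu_ids; returns False where unsupported."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, set(cpu_ids))
        return True
    return False
```

With two Prime95 instances on a dual-processor machine you would give
each instance a single CPU, i.e. affinity_mask([0]) for one and
affinity_mask([1]) for the other.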

If you run Linux and have a hyperthreaded machine, it would definitely
be worth your while to upgrade to the 2.6 kernel.

Thanks for the interesting comments/discussion Brian and Matthias.

Regards,
Jeff.

<<attachment: SpeedupGraph.gif>>

_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
