Thomas Fischbacher wrote on Wed, Feb 04, 2004 at 12:31:54PM +0100: 
> 
> I frequently experience that due to its overly long pipeline, the P4 
> frequently behaves a bit "strange" in benchmarks. While having such a long 
> pipeline is a drawback in many situations, I also have seen some rare 
> cases where it leads to tremendous, almost incredible speed gains in 
> LISP code when comparing to scaled-up P3 values.

I can second the notion that P-4s behave in very nonstandard ways.

A few things to keep in mind:
- the P4 does bit-shifting, integer division, and integer
  multiplication more slowly than previous processors.  Arranging your
  data so that you don't have to mask tag bits in and out is a huge win
- the P4 has the trace cache which makes function calls very cheap up
  to a given call depth.  Unfortunately, unless I overlook something,
  our x86 port doesn't make use of this feature because we do function
  calls by call which don't invoke the trace cache.

Both make inlining a big win and make using arrays of untagged
numbers more valuable.
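
To make the tagging point concrete, here is a rough C++ sketch.  It is
not our actual object representation: the 2-bit tag layout and the
names box, unbox, sum_tagged and sum_untagged are all invented for
illustration.  It only shows where the extra shift per element comes
from when numbers are stored tagged, and that it disappears when the
array holds raw numbers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical layout: the low 2 bits of a word hold a type tag,
// the upper 30 bits hold the integer payload.
constexpr std::uint32_t kTagBits = 2;
constexpr std::uint32_t kFixnumTag = 0x1;

// Store an integer as a tagged word (payload must fit in 30 bits).
inline std::uint32_t box(std::uint32_t n) {
    assert(n < (1u << 30));
    return (n << kTagBits) | kFixnumTag;
}

// Recover the integer: one shift per access, the kind of operation
// the P4 is comparatively slow at.
inline std::uint32_t unbox(std::uint32_t word) {
    return word >> kTagBits;
}

// Summing tagged words pays for a shift on every element.
std::uint64_t sum_tagged(const std::vector<std::uint32_t>& words) {
    std::uint64_t total = 0;
    for (std::uint32_t w : words) total += unbox(w);
    return total;
}

// Summing an untagged array needs only loads and adds.
std::uint64_t sum_untagged(const std::vector<std::uint32_t>& values) {
    std::uint64_t total = 0;
    for (std::uint32_t v : values) total += v;
    return total;
}
```

Both functions compute the same sum; the second just keeps the shift
out of the inner loop.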

If you want to see something truly awful, observe Franz Allegro 5.x
code on a Pentium-4.  An early 2.4 GHz P-4 with normal RAM (no RAMBUS)
was precisely the same speed as a 1 GHz P-3.


Another anecdote I can share on the Pentium-4: 

For a private project I was debugging code in a multi-threaded C++
program that had worker threads doing gzip decompression and a master
thread first arranging work, then starting the threads, then waiting
for them to finish.  On a Pentium-4 it deadlocked; on everything else
it worked like a charm.

It turns out that on the Pentium-4 a straightforward algorithm like zlib
decompression is so blazingly fast that the first decompression worker
could finish decompressing a substantial file faster than the master
thread could make the system calls to set up a few threads and then
start waiting on a condition variable.  The worker's wakeup call
arrived before the master was waiting for the signal, so the wakeup
was dropped on the floor, leading to deadlock.
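
For the record, the usual cure for this kind of lost wakeup is to
pair the condition variable with a flag that is protected by the same
mutex, and to have the waiter check the flag instead of relying on
the signal alone.  The sketch below is not the code from that project:
it uses std::condition_variable from standard C++, and the Completion
type and run_fast_worker_then_wait are made-up names.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a one-shot "work is finished" signal.
struct Completion {
    std::mutex m;
    std::condition_variable cv;
    bool done = false;  // the state the waiter actually cares about

    // Called by the worker when it has finished.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(m);
            done = true;  // record completion under the mutex
        }
        cv.notify_one();  // harmless if nobody is waiting yet
    }

    // Called by the master.  The predicate is checked before blocking,
    // so a signal that arrived earlier is not lost.
    void wait() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return done; });
        assert(done);
    }
};

// Forces the worst case: the worker has already signalled before the
// master starts waiting.  Returns true once the master has seen it.
bool run_fast_worker_then_wait() {
    Completion c;
    std::thread worker([&c] { c.signal(); });
    worker.join();  // worker is completely finished here
    c.wait();       // returns at once because done is already true
    return c.done;
}
```

With a bare wait and no flag, the same ordering would block forever,
which is exactly what the P4 managed to provoke.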

Martin
-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer <[EMAIL PROTECTED]>   http://www.cons.org/cracauer/
 No warranty.    This email is probably produced by one of my cats 
 stepping on the keys. No, I don't have an infinite number of cats.
