Hello,

I am trying to understand some weird performance characteristics
on a newly purchased blade (see statistics below).

The hardware is an HP BL465 with 2 dual-core AMD HE 2216 processors.
This is the first AMD and the first 64 bit processor we have bought.

I installed SL44 x86_64 and we did some performance tests.

When running a single job (compute-bound Monte Carlo with HBOOK output),
the job took about twice as long as on our Intel based blade. Although
this difference could be attributed to the difference in processors,
running several single jobs in a row produced rather erratic results:
run times varied by 200-300 seconds on a 900 second job.
Some were comparable to the 32 bit processor, some were not.
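
For anyone who wants to reproduce the run-to-run variation, here is a
minimal sketch of how back-to-back timings could be collected. It is not
necessarily the procedure behind the numbers in this message. The function
names time_runs and summarize are hypothetical, and the command shown is
only a placeholder for the real job.

```python
import statistics
import subprocess
import sys
import time


def time_runs(cmd, repeats=5):
    """Run cmd `repeats` times back to back; return wall-clock seconds per run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return min, max, mean and spread (max - min) of a list of timings."""
    return {
        "min": min(durations),
        "max": max(durations),
        "mean": statistics.mean(durations),
        "spread": max(durations) - min(durations),
    }


if __name__ == "__main__":
    # Placeholder command (an assumption); substitute the real job here.
    placeholder_cmd = [sys.executable, "-c", "pass"]
    print(summarize(time_runs(placeholder_cmd)))
```

A spread of a few hundred seconds on a 900 second job, as described above,
would show up directly in the "spread" entry.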

Also, when running 4 copies of the same job in parallel, each job
finished almost twice as fast as a single job run alone!
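
The parallel case could be timed along the following lines. This is only a
sketch under assumptions: time_parallel and timed_run are hypothetical
names, the command is a placeholder, and each copy is timed by its own
thread so that a job finishing early is not charged for the others.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(cmd):
    """Run cmd once and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def time_parallel(cmd, copies=4):
    """Start `copies` instances of cmd together; return one timing per copy."""
    with ThreadPoolExecutor(max_workers=copies) as pool:
        return list(pool.map(timed_run, [cmd] * copies))


if __name__ == "__main__":
    # Placeholder command (an assumption); substitute the real job here.
    placeholder_cmd = [sys.executable, "-c", "pass"]
    for i, seconds in enumerate(time_parallel(placeholder_cmd), start=1):
        print(f"job {i}: {seconds:.2f} sec")
```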

I then (for fun) installed SL43 x86_64. This produced results
quite different from those on SL44 and more comparable to
our 32 bit blades.

Below is a sample of the CPU statistics.

We first ran the existing 32 bit executable.

We then recompiled and ran the 64 bit executable.

Many of our jobs cannot be recompiled (they won't compile on gcc 3.4 or have
missing libraries), so we would really like to understand this performance
discrepancy for 32 bit executables on SL44.

32 bit executable, single job

            SL44            SL43
          906 sec         556 sec
  
32 bit executable, 4 jobs in parallel

            SL44            SL43
job 1     452 sec         446 sec
job 2     446 sec         442 sec
job 3     445 sec         444 sec
job 4     448 sec         446 sec


64 bit executable, single job

            SL44            SL43
          510 sec         497 sec

The 64 bit executable seems to be a little more predictable.
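
To put the 32 bit timings above in proportion, the ratios can be worked
out as below. This is just arithmetic on the numbers already quoted; the
variable and function names are mine and purely illustrative.

```python
from statistics import mean

# Wall-clock seconds copied from the 32 bit tables in this message.
single_32 = {"SL44": 906, "SL43": 556}
parallel_32 = {
    "SL44": [452, 446, 445, 448],
    "SL43": [446, 442, 444, 446],
}


def single_vs_parallel(release):
    """Single-job time divided by the mean per-job time with 4 jobs in parallel."""
    return single_32[release] / mean(parallel_32[release])


def sl44_vs_sl43_single():
    """Single-job time on SL44 divided by the single-job time on SL43."""
    return single_32["SL44"] / single_32["SL43"]


for release in ("SL44", "SL43"):
    print(f"{release}: single / parallel mean = {single_vs_parallel(release):.2f}")
print(f"single job, SL44 / SL43 = {sl44_vs_sl43_single():.2f}")
# SL44: single / parallel mean = 2.02
# SL43: single / parallel mean = 1.25
# single job, SL44 / SL43 = 1.63
```

So on both releases a lone job is slower than each of 4 parallel jobs, but
on SL44 the gap is about a factor of 2 rather than 1.25.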


So, does anyone have any idea:

  1. Why is there such a difference in performance between SL44 and SL43?
     (Why does SL44 produce much slower results on a single job?)

  2. Why does running 4 jobs in parallel produce faster results than
     a single job? One would think jobs running in parallel
     would be slightly slower.

  3. Why does running 4 jobs in parallel on SL44 produce much
     faster results (about 900 sec for a single job vs 452 sec)?

  4. Should we not be running our 32 bit executables on an
     x86_64 installation of SLxx?
     I have not yet tried installing SL44 (or SL43) x86 to check the
     performance. Should I?
    



Thanks for any insight or help

Regards
Rochelle Lauer
Yale University Physics
