Re: [Beowulf] quad-core SPECfp2006: where are 4 FPresults/cycle ?

2007-10-13 Thread Mikhail Kuzminsky
In message from Mark Hahn [EMAIL PROTECTED] (Fri, 12 Oct 2007 16:09:05 -0400 (EDT)): This means that 2 additional FP results per cycle in the microarchitecture give only about a 7% performance increase :-( The 4 flops/cycle is really for linpack-like code: it assumes you are executing packed

Re: [Beowulf] quad-core SPECfp2006: where are 4 FPresults/cycle ?

2007-10-13 Thread Mikhail Kuzminsky
In message from [EMAIL PROTECTED] (Fri, 12 Oct 2007 20:50:08 +): Mikhail, I am not sure I fully understand what you are presenting here, but I might say that yes, at the FPU level the AMD Opteron/Barcelona and the Intel Core2/Clovertown series (and also Harpertown at 45 nm) are

[Beowulf] quad-core SPECfp2006: where are 4 FPresults/cycle ?

2007-10-12 Thread Mikhail Kuzminsky
I found the first AMD quad-core (Opteron 2347, 1.9 GHz) SPECfp2006 results (at www.spec.org), obtained by IBM: 11.2/10.7 for peak/base values. I'll talk about 1 core only, i.e. results with Autoparallel=NO. Let me look at another x86-64 microarchitecture with the same 4*64-bit FP results per cycle, i.e. Intel

Re: [Beowulf] quad-core SPECfp2006: where are 4 FPresults/cycle ?

2007-10-12 Thread Mark Hahn
This means that 2 additional FP results per cycle in the microarchitecture give only about a 7% performance increase :-( The 4 flops/cycle is really for linpack-like code: it assumes you are executing packed double SIMD. The question is - should we wait for some better results for new incoming
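
For a rough sense of scale, the theoretical peak implied by the figures quoted in this thread can be sketched as clock rate times FP results per cycle. This is back-of-the-envelope arithmetic, not a benchmark. The function name `peak_gflops` is made up for illustration; the clock rates and per-cycle widths are the ones quoted in the thread (1.9 GHz with 4 results/cycle for the Opteron 2347, 3 GHz with 2 results/cycle for the previous Opteron generation).

```python
def peak_gflops(clock_ghz: float, fp_results_per_cycle: int) -> float:
    """Theoretical per-core peak in GFLOPS: clock (GHz) x FP results per cycle.

    Hypothetical helper for illustration only. It assumes every cycle
    retires the full number of FP results, i.e. ideal packed-double SIMD code.
    """
    return clock_ghz * fp_results_per_cycle


# Figures as quoted in the thread (per core).
quad_core_peak = peak_gflops(1.9, 4)   # Opteron 2347, 4 x 64-bit results/cycle
prev_gen_peak = peak_gflops(3.0, 2)    # Opteron SE at 3 GHz, 2 x 64-bit results/cycle

print(f"Opteron 2347 @ 1.9 GHz: {quad_core_peak:.1f} GFLOPS peak per core")
print(f"Opteron SE   @ 3.0 GHz: {prev_gen_peak:.1f} GFLOPS peak per core")
print(f"Peak ratio: {quad_core_peak / prev_gen_peak:.2f}")
```

These numbers are upper bounds that hold only when the code keeps the packed double SIMD units busy on every cycle, which is the linpack-like case; they say nothing directly about what a SPECfp2006 run will achieve.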

Re: [Beowulf] quad-core SPECfp2006: where are 4 FPresults/cycle ?

2007-10-12 Thread richard . walsh
-- Original message -- From: Mikhail Kuzminsky [EMAIL PROTECTED] But if I compare SPECfp2006 results with an x86-64 microarchitecture with 2*64-bit FP results per cycle (the previous Opteron generation), I see a somewhat strange (IMHO) result. So, for Opteron SE/3 GHz,