I just rebuilt OpenBLAS 0.2.20 locally on the test system with GCC 6.1.0,
and I'm only getting 91 GFLOPS. I'm pretty sure OpenBLAS performance
should be close to ACML performance, if not better. For now, I'm going
to continue my testing using the ACML-based build and dig into the
OpenBLAS performance later.

On 02/22/2018 05:27 PM, Prentice Bisbal wrote:
So I just rebuilt HPL using the ACML 6.1.0 libraries with GCC 6.1.0,
and I'm now getting 197 GFLOPS, so clearly there's a problem with my
OpenBLAS build. I'm going to try building OpenBLAS without the dynamic
arch support on the machine where I plan on running my tests, and see
if that version of the library is any better.

On 02/22/2018 09:37 AM, Prentice Bisbal wrote:
In your experience, how closely does the actual performance of your
processors match their theoretical performance? I'm investigating a
performance issue on some of my nodes. These are older systems using
AMD Opteron 6274 processors. I found literature from AMD stating the
theoretical performance of these processors is 282 GFLOPS, and my
LINPACK performance isn't coming close to that (I get approximately
33% of that). The number I often hear mentioned is that actual
performance should be ~85% of theoretical performance. Is that a
realistic number in your experience?
I don't want this to be a discussion of what could be wrong at this
point. We will get to that in future posts, I assure you!
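
For reference, the efficiency arithmetic behind the figures in this
thread can be sketched as below. The 282, 197 and 91 GFLOPS values and
the ~85% rule of thumb are taken from the messages. The hardware
parameters used in the peak calculation (2 sockets, 16 cores per socket,
2.2 GHz, 4 double-precision FLOPs per core per cycle) are assumptions
that are not stated anywhere in the thread; they are included only
because they happen to give a number close to 282. The function names
are hypothetical.

```python
# Minimal sketch: HPL result as a fraction of theoretical peak.
# GFLOPS figures are the ones quoted in the thread. The hardware
# parameters in the peak calculation are ASSUMPTIONS, not from the thread.


def theoretical_peak_gflops(sockets, cores_per_socket, clock_ghz, flops_per_cycle):
    """Peak = sockets * cores per socket * clock (GHz) * FLOPs per core per cycle."""
    return sockets * cores_per_socket * clock_ghz * flops_per_cycle


def efficiency(measured_gflops, peak_gflops):
    """Return measured performance as a fraction of peak (0.0 to 1.0)."""
    if peak_gflops <= 0:
        raise ValueError("peak_gflops must be positive")
    return measured_gflops / peak_gflops


# Theoretical peak quoted in the thread (from AMD literature).
QUOTED_PEAK_GFLOPS = 282.0

# Measured HPL results quoted in the thread.
MEASURED_GFLOPS = {
    "OpenBLAS 0.2.20 (local rebuild)": 91.0,
    "ACML 6.1.0": 197.0,
}

# Rule-of-thumb target mentioned in the thread.
TARGET_EFFICIENCY = 0.85

if __name__ == "__main__":
    # ASSUMED node configuration: 2 sockets, 16 cores each, 2.2 GHz,
    # 4 double-precision FLOPs per core per cycle.
    assumed_peak = theoretical_peak_gflops(2, 16, 2.2, 4)
    print(f"Assumed-config peak: {assumed_peak:.1f} GFLOPS")
    print(f"Quoted peak:         {QUOTED_PEAK_GFLOPS:.1f} GFLOPS")

    for build, gflops in MEASURED_GFLOPS.items():
        frac = efficiency(gflops, QUOTED_PEAK_GFLOPS)
        print(f"{build}: {gflops:.0f} GFLOPS = {frac:.1%} of quoted peak")

    target = TARGET_EFFICIENCY * QUOTED_PEAK_GFLOPS
    print(f"{TARGET_EFFICIENCY:.0%} of quoted peak would be {target:.1f} GFLOPS")
```

Against the quoted 282 GFLOPS peak, this puts the OpenBLAS build at
roughly 32% and the ACML build at roughly 70%, both short of the ~85%
figure asked about above.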
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing