I just rebuilt OpenBLAS 0.2.20 locally on the test system with GCC 6.1.0, and I'm only getting 91 GFLOPS. I'm pretty sure OpenBLAS performance should be close to ACML performance, if not better. For now, I'm going to continue my testing with the ACML-based build and dig into the OpenBLAS performance later.


On 02/22/2018 05:27 PM, Prentice Bisbal wrote:
So I just rebuilt HPL using the ACML 6.1.0 libraries with GCC 6.1.0, and I'm now getting 197 GFLOPS, so clearly there's a problem with my OpenBLAS build. I'm going to try building OpenBLAS without the dynamic arch support on the machine where I plan on running my tests, and see if that version of the library is any better.


On 02/22/2018 09:37 AM, Prentice Bisbal wrote:

In your experience, how closely does the actual performance of your processors match their theoretical performance? I'm investigating a performance issue on some of my nodes. These are older systems using AMD Opteron 6274 processors. I found literature from AMD stating that the theoretical performance of these processors is 282 GFLOPS, and my LINPACK performance isn't coming close to that (I get approximately 33% of it). The number I often hear mentioned is that actual performance should be ~85% of theoretical performance. Is that a realistic number in your experience?
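
For reference, the ratios discussed in this thread can be worked out with a small sketch like the one below. It only uses figures quoted in these messages (282 GFLOPS theoretical, 91 GFLOPS with OpenBLAS, 197 GFLOPS with ACML, and the ~85% rule of thumb). The function name `hpl_efficiency` and the variable names are made up for illustration and are not part of HPL or any library.

```python
def hpl_efficiency(measured_gflops, peak_gflops):
    """Return measured performance as a fraction of theoretical peak."""
    if peak_gflops <= 0:
        raise ValueError("peak_gflops must be positive")
    return measured_gflops / peak_gflops


# Theoretical figure quoted from AMD literature in this thread.
PEAK_GFLOPS = 282.0

# Measured HPL results reported in this thread, keyed by BLAS library.
results = {
    "OpenBLAS 0.2.20": 91.0,
    "ACML 6.1.0": 197.0,
}

for name, gflops in results.items():
    eff = hpl_efficiency(gflops, PEAK_GFLOPS)
    print(f"{name}: {gflops:.0f} GFLOPS is {eff:.1%} of peak")

# What the often-quoted ~85% rule of thumb would imply for this peak.
target_gflops = 0.85 * PEAK_GFLOPS
print(f"85% of peak would be {target_gflops:.1f} GFLOPS")
```

By this arithmetic the OpenBLAS run is at roughly a third of the quoted peak and the ACML run at roughly 70%, so neither reaches the ~85% figure.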

I don't want this to be a discussion of what could be wrong at this point; we will get to that in future posts, I assure you!

Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing