On Wed, 8 May 2019 19:56:01 +0200 Kenneth Hoste <[email protected]> wrote:
> This is clearly a bug in OpenBLAS that could have been prevented.
> I haven't studied this issue in detail myself yet, but I have seen
> comments pass by from OpenBLAS maintainers who say they don't have
> Skylake hardware to test on.
> That makes me wonder how well the rest of OpenBLAS is tested, which is a
> bit infuriating for a library that important.

The situation seems very similar to OpenSSL ... a library that was taken
for granted by the whole world until some serious issues were found. One
wonders how many planes have to crash or rockets explode before some
effort is put behind OpenBLAS ...

> On the EasyBuild side, I think we have a couple of options for mitigation:
>
> 1) Add easyconfigs for the latest version of OpenBLAS to the next
> EasyBuild release (v3.9.1) which can be used to swap out the OpenBLAS
> included in recent foss toolchains.
> I suspect simply doing a "module swap" to the newer OpenBLAS version is
> sufficient in most cases (if OpenBLAS was not statically linked, and if
> RPATH is not used).
>
> 2) Modify the toolchain definition of foss/2018b (and foss/2019a?) to
> use the newer OpenBLAS version.
> I'm not sure if this is too drastic or not, but it would be up to each
> site to decide whether or not they want to update their already installed
> foss modules to pick up on this or not.
>
> 3) Collect test programs/scripts/benchmarks in a central repository
> (easybuild-testing?), so we can assess the stability of future OpenBLAS
> versions that we consider for inclusion in the 'foss' toolchains.
>
> You could state that this isn't our 'job', but if the OpenBLAS
> maintainers are not capable of properly testing their releases on recent
> hardware, then I guess it's our duty to try and catch problems like this
> ourselves before they blow up in our faces weeks (or months) later.
>
> Anyone who would be up for helping out with this?
>
> For now we should definitely focus on covering this OpenBLAS issue well,
> but I can see this thing growing out as another central repo where we
> pool together efforts done on testing/benchmarking on top of modules
> installed with EasyBuild...

I'm currently working on #2 but will try to do something on #3 as well.

-- 
Jurij Pečar
HPC Engineer, IT Operations, IT Services
EMBL Heidelberg, Meyerhofstraße 1, 69117, Heidelberg, Germany
Room 13-401
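
For reference, a minimal sketch of the kind of check that the central
test repository proposed in 3) might collect. It compares a matrix
product from the library under test against a slow pure-Python
reference. Everything in it is an assumption made for illustration: the
function names, the matrix size, the tolerance, and the use of NumPy as
the route to OpenBLAS (NumPy only exercises OpenBLAS if it was built
against it). It is not a reproducer for the Skylake issue discussed
above.

```python
import random


def reference_matmul(a, b):
    """Slow pure-Python matrix product that does not depend on any BLAS."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(inner)) for j in range(cols)]
            for i in range(rows)]


def max_abs_diff(x, y):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(u - v) for rx, ry in zip(x, y) for u, v in zip(rx, ry))


def check_matmul(matmul, size=64, tol=1e-9, seed=0):
    """Run `matmul` on seeded random square inputs and compare to the reference.

    `matmul` is any callable taking two lists of lists (hypothetical hook
    for the library under test). Returns (passed, max_difference).
    """
    rng = random.Random(seed)
    a = [[rng.uniform(-1, 1) for _ in range(size)] for _ in range(size)]
    b = [[rng.uniform(-1, 1) for _ in range(size)] for _ in range(size)]
    diff = max_abs_diff(matmul(a, b), reference_matmul(a, b))
    return diff <= tol, diff


if __name__ == "__main__":
    try:
        import numpy as np  # assumed to be linked against the BLAS under test
    except ImportError:
        print("SKIP: NumPy not available")
    else:
        def blas_matmul(a, b):
            return (np.array(a) @ np.array(b)).tolist()

        ok, diff = check_matmul(blas_matmul)
        print("PASS" if ok else "FAIL", diff)
```

Run once per OpenBLAS module under test (for example before and after a
"module swap"); a FAIL means the two products differ by more than the
assumed tolerance.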

