If this fix were submitted as a PR for OpenBLAS 0.3.1, which is part of the foss/2018b toolchain, would it be accepted? One of the main purposes of EasyBuild is to facilitate reproducible science. Changing a published easyconfig is breaking the rules. There is a huge amount of flawed software that we need to deal with, but the OpenBLAS issue will affect analysis results. Rebuilding OpenBLAS in place should not create any issues; I assume that nothing is statically linked to OpenBLAS.

Moving to 2019a is not an option for me at this time.

--
John Dey
HPC Operations, Scientific Computing
Fred Hutchinson Cancer Research Center
fredhutch.org

From: Carlos Fenoy
Date: Tuesday, May 28, 2019 at 5:32 AM
To: easybuild
Subject: Re: [easybuild] Openblas(foss) matrix issue

Hi,

After fighting with this for a long time, we arrived at a solution that passes both the "Openblas_matrix_issue" and "BLAS_tester" test suites.

To solve the issue we had to apply a patch and add a new build parameter (USE_SIMPLE_THREADED_LEVEL3=1) to OpenBLAS to make it work with multiple OpenMP threads.

This is what the buildopts line looks like for us:

    buildopts = ' USE_SIMPLE_THREADED_LEVEL3=1 BINARY=64 USE_THREAD=1 USE_OPENMP=1 CC="$CC" FC="$F77" DYNAMIC_ARCH=1'

We took the patch from this commit in the OpenBLAS repo:
https://github.com/xianyi/OpenBLAS/commit/b14f44d2adbe1ec8ede0cdf06fb8b09f3c4b6e43
(you can get the patch by adding .patch at the end of the URL)

Regards,
Carlos

On Mon, May 27, 2019 at 6:15 PM Pablo Escobar Lopez wrote:

Hi,

Has anyone found a working patch or workaround for the matrix issue when using OpenBLAS-0.3.1?
After a lot of trial and error I couldn't pass the tests in https://github.com/eylenth/Openblas_matrix_issue when using https://github.com/easybuilders/easybuild-easyconfigs/blob/master/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.1-GCC-7.3.0-2.30.eb, no matter what patches, toolchainopts or buildopts I use (and I have tried a few different combinations).

Is anyone able to pass the tests using OpenBLAS-0.3.1?

I could pass the tests using OpenBLAS-0.3.5, but upgrading my foss/2018b toolchain would be quite messy because I use RPATH. The least intrusive solution for my users would be to patch OpenBLAS-0.3.1 somehow, but I couldn't find a working solution. Any suggestions?

Regards,
Pablo.

P.S. On a related topic: IMHO, unless there is a proper workaround, I would suggest that we stop providing OpenBLAS-0.3.1 with EasyBuild. Right now we are distributing a broken library.

On Tue, May 7, 2019 at 6:34 PM Mikael Öhman wrote:

Hi Thomas,

I can also confirm these issues. I tried rebuilding OpenBLAS+R after the fix in #7180, but I still saw the same problems. Very large matrix-matrix multiplications randomly gave wrong results, with very large errors. The larger the matrix, the more frequent the errors.
In the end, I compiled an intel version (but I had to remove a few extensions that didn't build) and removed my foss version from our installations.

Perhaps it's related to hardware; I saw this happen on Skylake servers. I haven't had time to check whether https://github.com/easybuilders/easybuild-easyconfigs/issues/8197 also affects 0.3.1.

Best regards,
Mikael

On Tue, May 7, 2019 at 6:12 PM Thomas Eylenbosch wrote:

Hello,

Some of our end users reported a calculation issue with matrices when working with a foss/2018b module.

I reproduced this error with Python and R compiled with the foss/2018b toolchain: the output contains unexpected results. When I ran the same test with Python and R compiled with the foss/2016b toolchain, I got the expected behavior.

You can reproduce this error with the following GitHub repository:
https://github.com/eylenth/Openblas_matrix_issue

I have also tried rebuilding from the OpenBLAS-0.3.1-GCC-7.3.0-2.30.eb easyconfig file with "toolchainopts = {'vectorize': False}" (cf. https://github.com/easybuilders/easybuild-easyconfigs/issues/7180), but it still gives me unexpected behavior.

Can someone try to reproduce the error with the R/Python (foss/2018b) modules, or give me feedback on this?

Thank you in advance.

Kind regards,
Thomas Eylenbosch
Ext: Gluo N.V.
BASF Agricultural Solutions Belgium NV

--
Pablo Escobar López
Linux/HPC systems engineer
sciCORE, University of Basel
SIB Swiss Institute of Bioinformatics
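
For readers who want to try the fix Carlos describes in this thread, the two changes (the upstream patch plus the extra build parameter) could be captured in a local copy of OpenBLAS-0.3.1-GCC-7.3.0-2.30.eb roughly as sketched below. This is a sketch, not the easyconfig Carlos actually used: the patch file name is hypothetical (the thread only says the patch can be downloaded by appending .patch to the commit URL), and every other parameter of the easyconfig is assumed to stay as it is.

```python
# Sketch of the lines one might change in a local copy of
# OpenBLAS-0.3.1-GCC-7.3.0-2.30.eb; easyconfigs use Python syntax.

# Hypothetical file name for the upstream commit
# b14f44d2adbe1ec8ede0cdf06fb8b09f3c4b6e43, saved via "<commit URL>.patch".
# If the easyconfig already lists patches, append to that list instead.
patches = ['OpenBLAS-0.3.1_b14f44d-upstream-commit.patch']

# Build options exactly as quoted in Carlos's message;
# USE_SIMPLE_THREADED_LEVEL3=1 is the newly added parameter.
buildopts = ' USE_SIMPLE_THREADED_LEVEL3=1 BINARY=64 USE_THREAD=1 USE_OPENMP=1 CC="$CC" FC="$F77" DYNAMIC_ARCH=1'
```

After rebuilding with such a change, the test suites named in the thread ("Openblas_matrix_issue" and "BLAS_tester") are the natural way to check whether the wrong results are gone on a given system.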

