Hi science team,

As usual, I'd like to inform the team before registering a new LAPACK
implementation into our BLAS/LAPACK ecosystem. The new implementation is
called "libflame", and it comes from the same upstream as BLIS:

  https://github.com/flame/libflame

Similar to BLIS, it is an object-based, LAPACK-like implementation, and it
provides a compatibility layer for the traditional (Fortran) LAPACK
interface called "lapack2flame".

I noticed this library because it is part of AMD's revived math library
stack (to some extent the MKL counterpart?):

  https://developer.amd.com/amd-aocl/

It is also worth noting that AMD has upstreamed their patches to BLIS.
That's a healthy phenomenon.

My preliminary tests of single-precision SVD factorization demonstrate a
significant improvement over the netlib lapack and the openblas lapack[1]
implementations. Please find the results in the last part of this mail.

Given these observations, I propose to

 * set the priority value of `libflame` (as a liblapack.so.3 provider)
   to 80, because
   1) I'm still not sure whether the libflame compat layer provides the
      complete ABI;
   2) we have not tested it sufficiently;
   3) 80 is close to the BLIS priority values (for libblas.so.3).

---

My test code can be found in the MKL packaging:

  https://salsa.debian.org/science-team/intel-mkl/blob/master/debian/tests/test-gesvd.cc

Preliminary packaging can be found here:

  https://salsa.debian.org/science-team/libflame

Switching alternatives has been made easy by my tiny util:

  https://tracker.debian.org/pkg/rover

Results on Xeon Gold 6126 (sgesvd_, 512x512 matrix size):

  BLAS=openblas LAPACK=openblas -> ~560ms  # pthread
  BLAS=atlas    LAPACK=atlas    -> N/A     # cgesvdq_ symbol not found
  BLAS=netlib   LAPACK=netlib   -> ~820ms
  BLAS=atlas    LAPACK=netlib   -> ~600ms
  BLAS=blis     LAPACK=netlib   -> ~560ms  # BLIS_NUM_THREADS=1
  BLAS=netlib   LAPACK=libflame -> ~700ms
  BLAS=atlas    LAPACK=libflame -> ~490ms
  BLAS=blis     LAPACK=libflame -> ~415ms  # BLIS_NUM_THREADS=1
  BLAS=openblas LAPACK=libflame -> ~415ms

I didn't compare it with MKL (non-free). That's unnecessary.

[1] openblas lapack ≈ netlib lapack, except for a few routines.
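
To make the comparison in the results easier to read, here is a minimal
sketch that turns the approximate timings into per-BLAS speedup ratios. It is
not part of the test code linked above; the timings are simply copied from
the table, and the `reference_lapack` mapping and variable names are
assumptions introduced only for this illustration.

```python
# Approximate sgesvd_ timings in ms (512x512, Xeon Gold 6126), copied from
# the results in this mail. Keys are (BLAS, LAPACK). The atlas/atlas run is
# omitted because it failed (N/A).
timings_ms = {
    ("openblas", "openblas"): 560,
    ("netlib", "netlib"): 820,
    ("atlas", "netlib"): 600,
    ("blis", "netlib"): 560,
    ("netlib", "libflame"): 700,
    ("atlas", "libflame"): 490,
    ("blis", "libflame"): 415,
    ("openblas", "libflame"): 415,
}

# Hypothetical helper mapping for this sketch: for each BLAS, the
# non-libflame LAPACK it was paired with in the table.
reference_lapack = {
    "openblas": "openblas",
    "netlib": "netlib",
    "atlas": "netlib",
    "blis": "netlib",
}

# Speedup of libflame over the reference LAPACK, with the BLAS held fixed.
gain = {}
for blas, ref in reference_lapack.items():
    gain[blas] = timings_ms[(blas, ref)] / timings_ms[(blas, "libflame")]

for blas, g in sorted(gain.items(), key=lambda kv: kv[1], reverse=True):
    print(f"BLAS={blas:<8} libflame ~{g:.2f}x faster than LAPACK={reference_lapack[blas]}")
```

With the numbers above, the ratios fall between roughly 1.17x (netlib BLAS)
and 1.35x (blis and openblas BLAS). Since the timings are approximate, these
ratios should be read as rough indications only.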

