Some updates. Previously I ran the experiments with LD_PRELOAD=...libflame.so to override the lapack symbols. When libflame.so is actually used as the liblapack.so provider, programs may fail because of missing symbols.
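
A rough way to see which symbols a candidate library lacks is to probe it with dlopen()/dlsym(). The following is only a minimal Python sketch: the library path and the helper name `missing_symbols` are hypothetical, and the two symbol names are simply the ones mentioned in this thread. Note that dlsym() also searches the dependencies of the loaded library, so a symbol reported as present may come from a dependency.

```python
import ctypes


def missing_symbols(library, symbols):
    """Return the names in `symbols` that cannot be resolved via `library`.

    `library` is a path or soname handed to dlopen(); None means the
    symbols already loaded into the current process.
    """
    handle = ctypes.CDLL(library)
    # Attribute lookup on a CDLL object performs dlsym(); a missing symbol
    # raises AttributeError, which hasattr() reports as False.
    return [name for name in symbols if not hasattr(handle, name)]


if __name__ == "__main__":
    # Hypothetical path: adjust to wherever the library under test lives.
    path = "/usr/lib/x86_64-linux-gnu/libflame.so"
    try:
        print(missing_symbols(path, ["sgesvd_", "cgesvdq_"]))
    except OSError as err:
        print("could not load", path, "-", err)
```

An empty list means every listed symbol was resolved; it says nothing about symbols that were not listed.
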
Now the missing symbols are borrowed from liblapack_pic.a (liblapack-dev) when building the shared lapack.so (libflame). A package built from the master branch should be usable as a general liblapack.so provider: programs no longer fail with errors such as "undefined symbols", and the speed gain is retained.

Apart from that, I noted that 5 of the numpy unit tests fail with the libflame::lapack.so backend.

On Mon, Dec 02, 2019 at 01:13:04PM +0000, Mo Zhou wrote:
> Hi science team,
>
> As usual, I'd like to inform the team before registering a new lapack
> implementation into our blas/lapack ecosystem. The new implementation
> is called "libflame", from the upstream of BLIS:
> https://github.com/flame/libflame
> Similar to BLIS, it is a lapack-like object-based implementation, and
> provides a compatibility layer to the traditional (fortran) lapack
> called "lapack2flame".
>
> I noticed this library because it is part of AMD's revived math library
> stack (to some extent the MKL counterpart?):
> https://developer.amd.com/amd-aocl/
> It is also worth noting that AMD has upstreamed their patches to BLIS.
> That's a healthy phenomenon.
>
> My preliminary tests of single precision SVD factorization demonstrate a
> significant improvement over the netlib lapack and the openblas
> lapack[1] implementation. Please find the results in the last part of
> this mail.
>
> Given these observations, I propose to
>
> * set the priority value of `libflame` (as a liblapack.so.3 provider)
>   to 80,
>
> because 1) I'm still not sure whether the libflame compat layer provides
> the complete ABI; 2) we have not tested it sufficiently; 3) 80 is close
> to the BLIS priority values (for libblas.so.3).
>
> ---
>
> My test code can be found in the MKL packaging:
> https://salsa.debian.org/science-team/intel-mkl/blob/master/debian/tests/test-gesvd.cc
> Preliminary packaging can be found here:
> https://salsa.debian.org/science-team/libflame
> Switching alternatives has been made easy by my tiny util:
> https://tracker.debian.org/pkg/rover
>
> Results on Xeon Gold 6126 (sgesvd_, 512x512 matrix size):
>
>   BLAS=openblas LAPACK=openblas -> ~560ms  # pthread
>   BLAS=atlas    LAPACK=atlas    -> N/A     # cgesvdq_ symbol not found
>
>   BLAS=netlib   LAPACK=netlib   -> ~820ms
>   BLAS=atlas    LAPACK=netlib   -> ~600ms
>   BLAS=blis     LAPACK=netlib   -> ~560ms  # BLIS_NUM_THREADS=1
>
>   BLAS=netlib   LAPACK=libflame -> ~700ms
>   BLAS=atlas    LAPACK=libflame -> ~490ms
>   BLAS=blis     LAPACK=libflame -> ~415ms  # BLIS_NUM_THREADS=1
>   BLAS=openblas LAPACK=libflame -> ~415ms
>
> I didn't compare it with MKL (non-free). That's unnecessary.
>
> [1] openblas lapack ≈ netlib lapack, except for a few routines.
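
To put a number on the improvement, the quoted timings can be turned into speedup ratios of libflame over netlib lapack for the same BLAS. This is only a small arithmetic sketch in Python: the figures are copied from the table above, which reports them as approximate, and the helper name `speedup` is made up for illustration.

```python
# Approximate timings in milliseconds from the quoted results
# (sgesvd_, 512x512 matrix), keyed by (BLAS, LAPACK).
timings_ms = {
    ("netlib", "netlib"): 820,
    ("atlas", "netlib"): 600,
    ("blis", "netlib"): 560,
    ("netlib", "libflame"): 700,
    ("atlas", "libflame"): 490,
    ("blis", "libflame"): 415,
}


def speedup(blas):
    """Ratio of netlib lapack time to libflame time for one BLAS backend."""
    return timings_ms[(blas, "netlib")] / timings_ms[(blas, "libflame")]


for blas in ("netlib", "atlas", "blis"):
    print(f"BLAS={blas}: {speedup(blas):.2f}x")
```

With these inputs the ratios come out to roughly 1.17x (netlib BLAS), 1.22x (atlas) and 1.35x (blis). Since the source timings are rounded, the ratios are only indicative.
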

