On Wed, 2019-05-29 at 22:33 +0800, Benda Xu wrote:
> Hi Michał,
>
> Michał Górny <mgo...@gentoo.org> writes:
>
> > On Tue, 2019-05-28 at 01:37 -0700, Mo Zhou wrote:
> > > Different BLAS/LAPACK implementations are expected to be compatible
> > > to each other in both the API and ABI level. They can be used as
> > > drop-in replacement to the others. This sounds nice, but the difference
> > > in SONAME hampered the gentoo integration of well-optimized ones.
> >
> > If SONAMEs are different, then they are not compatible by definition.
>
> This blas/lapack SONAME difference is a special case. They are partially
> compatible in the sense that every alternative implementation of blas is
> a superset of the reference one.
>
> Therefore linking to the reference at build time will make sure the
> compatibility with the alternative implementations, even with different
> SONAME.
>
> > > [...]
> > >
> > > Similar to Debian's update-alternatives mechanism, Gentoo's eselect
> > > is good at dealing with drop-in replacements as well. My preliminary
> > > investigation suggests that eselect is enough for enabling BLAS/LAPACK
> > > runtime switching. Hence, the proposed solution is eselect-based:
> > >
> > > * Every BLAS/LAPACK implementation should provide generic library
> > >   and eselect candidate libraries at the same time. Taking netlib,
> > >   BLIS and OpenBLAS as examples:
> > >
> > >   reference:
> > >
> > >     usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
> > >       -- default BLAS provider
> > >       -- candidate of the eselect "blas" unit
> > >       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> >
> > /usr/lib64 is not supposed to be modified by eselect, it's package
> > manager area. Yes, I know a lot of modules still do that but that's no
> > reason to make things worse when people are putting significant effort
> > to actually improve things.
>
> Sorry, I didn't see your reply before mine.
>
> We are going to use the LDPATH and ld.so.conf mechanism suggested by
> you.
>
> > >     usr/lib64/lapack/reference/liblapack.so.3 (SONAME=liblapack.so.3)
> > >       -- default LAPACK provider
> > >       -- candidate of the eselect "lapack" unit
> > >       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
> > >
> > >   blis (doesn't provide LAPACK):
> > >
> > >     usr/lib64/libblis.so.2 (SONAME=libblis.so.2)
> > >       -- general purpose
> > >
> > >     usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
> > >       -- candidate of the eselect "blas" unit
> > >       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> > >       -- compiled from the same set of object files as libblis.so.2
> > >
> > >   openblas:
> > >
> > >     usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
> > >       -- general purpose
> > >
> > >     usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
> > >       -- candidate of the eselect "blas" unit
> > >       -- will be symlinked to usr/lib64/libblas.so.3 by eselect
> > >       -- compiled from the same set of object files as libopenblas.so.0
> > >
> > >     usr/lib64/lapack/openblas/liblapack.so.3 (SONAME=liblapack.so.3)
> > >       -- candidate of the eselect "lapack" unit
> > >       -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
> > >       -- compiled from the same set of object files as libopenblas.so.0
> > >
> > > This solution is similar to Debian's[3]. This solution achieves our goal,
> > > and it requires us to patch upstream build systems (same to Debian).
> > > Preliminary demonstration for this solution is available, see below.
> >
> > So basically the three walls of text say in round-about way that you're
> > going to introduce custom hacks to recompile libraries with different
> > SONAME. Ok.
> >
> > > Is this solution reliable?
> > > --------------------------
> > >
> > > * A similar solution has been used by Debian for many years.
> > > * Many projects call BLAS/LAPACK libraries through FFI, including Julia.
> > >   (See Julia's standard library: LinearAlgebra)
> > >
> > > Proposed Changes
> > > ----------------
> > >
> > > 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
> > >    main repo. They use exactly the same source tarball. It's not quite
> > >    helpful to package these components in a fine-grained manner. A single
> > >    sci-libs/lapack package is enough.
> >
> > Where's the gain in that?
> >
> > > 2. Merge the "cblas" eselect unit into "blas" unit. It is potentially
> > >    harmful when "blas" and "cblas" point to different implementations.
> > >    That means "app-eselect/eselect-cblas" should be deprecated.
> > >
> > > 3. Update virtual/{blas,cblas,lapack,lapacke}. BLAS/LAPACK providers
> > >    will be registered in their dependency information.
> > >
> > > Note, ebuilds for BLAS/LAPACK reverse dependencies are expected to work
> > > with these changes correctly without change. For example, my local
> > > numpy-1.16.1 compilation was successful without change.
> > >
> > > Preliminary Demonstration
> > > -------------------------
> > >
> > > The preliminary implementation is available in my personal overlay[4].
> > > A simple sanity test script `check-cpp.sh` is provided to illustrate
> > > the effectiveness of the proposed solution.
> > >
> > > The script `check-cpp.sh` compiles two C++ programs -- one calls general
> > > matrix-matrix multiplication from BLAS, while another one calls general
> > > singular value decomposition from LAPACK. Once compiled, this script
> > > will switch different BLAS/LAPACK implementations and run the C++ programs
> > > without recompilation.
> > >
> > > The preliminary result is available here[5].
> > > (CPU=Power9, ARCH=ppc64le)
> > > From the experimental results, we find that
> > >
> > > For (512x512) single precision matrix multiplication:
> > > * reference BLAS takes ~360 ms
> > > * BLIS takes ~70 ms
> > > * OpenBLAS takes ~10 ms
> > >
> > > For (512x512) single precision singular value decomposition:
> > > * reference LAPACK takes ~1900 ms
> > > * BLIS (+reference LAPACK) takes ~1500 ms
> > > * OpenBLAS takes ~1100 ms
> > >
> > > The difference in computation speed illustrates the effectiveness of
> > > the proposed solution. Theoretically, any other package could take
> > > advantage from this solution without any recompilation as long as
> > > it's linked against a library with SONAME.
> >
> > An actual ABI compliance test, e.g. done using abi-compliance-checker
> > would be more interesting.
>
> As said above, the symbols don't need to be 1-1 copy of each other. Any
> library which is a superset of the reference one will work.
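
The superset claim quoted above can be checked mechanically rather than taken on faith. Here is a minimal Python sketch, assuming GNU binutils `nm` is installed. The helper names (`parse_nm_output`, `exported_symbols`, `missing_symbols`) are made up for illustration, and the paths in the usage note are the ones from the proposal, not verified install locations.

```python
import subprocess


def parse_nm_output(text):
    """Return the set of defined symbol names in `nm -D --defined-only` output."""
    symbols = set()
    for line in text.splitlines():
        fields = line.split()
        # Defined symbols print as "<address> <type> <name>"; skip anything else.
        if len(fields) != 3:
            continue
        name = fields[2]
        # Drop a symbol-version suffix such as "@@GLIBC_2.17" if nm prints one.
        symbols.add(name.split("@", 1)[0])
    return symbols


def exported_symbols(library_path):
    """Run nm on a shared library and collect its exported, defined symbols."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", library_path],
        check=True, capture_output=True, text=True,
    )
    return parse_nm_output(result.stdout)


def missing_symbols(reference, candidate):
    """Symbols the reference exports that the candidate lacks, sorted."""
    return sorted(reference - candidate)
```

Calling `missing_symbols(exported_symbols("/usr/lib64/blas/reference/libblas.so.3"), exported_symbols("/usr/lib64/blas/openblas/libblas.so.3"))` and getting an empty list would mean the candidate is a superset at the symbol-name level. It says nothing about argument types or calling conventions, which is where abi-compliance-checker would still be needed.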
Again, I'm willing to accept this under a USE="lapack_targets_virtual" configuration, but wholesale editing of DT_NEEDED entries is definitely too scary and too invasive for most non-sci/hpc users of Gentoo.

For 99% of users, OpenBLAS will be the right trade-off between performance and customizability. Every recompilation of libreoffice or chromium will devour more CPU cycles than switching between USE-flag implementations.
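
To see which SONAMEs a binary actually records in DT_NEEDED, and hence what any rewriting would touch, something like the following works. It is a rough sketch that assumes binutils `readelf` is available and that its `-d` output is in the C locale. `parse_needed` and `needed_libraries` are illustrative names, not an existing tool.

```python
import os
import re
import subprocess

# Matches readelf -d lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libblas.so.3]
_NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[(.+?)\]")


def parse_needed(readelf_output):
    """Extract DT_NEEDED sonames, in order, from `readelf -d` output."""
    return _NEEDED_RE.findall(readelf_output)


def needed_libraries(binary_path):
    """Run readelf on an ELF file and return its DT_NEEDED entries."""
    result = subprocess.run(
        ["readelf", "-d", binary_path],
        check=True, capture_output=True, text=True,
        # Force untranslated output so the regex above keeps matching.
        env={**os.environ, "LC_ALL": "C"},
    )
    return parse_needed(result.stdout)
```

Going by the SONAMEs in the proposal, a program linked against the generic library should list `libblas.so.3`, while one linked directly against OpenBLAS should list `libopenblas.so.0`. Only the former would follow a runtime switch without its DT_NEEDED being touched.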