[gentoo-dev] Finalizing: GSoC-2019: BLAS/LAPACK Runtime Switching

2019-07-24 Thread Mo Zhou
Hi Gentoo Developers,

I'm finalizing the GSoC2019 project "BLAS and LAPACK runtime switch".[1]
The documentation for users and developers is available here:

  https://wiki.gentoo.org/wiki/Blas-lapack-switch

BLAS and LAPACK are dense numerical linear algebra libraries of
"libc-importance" to scientific computing users. The runtime switching
mechanism enables users to easily switch the BLAS/LAPACK library
system-wide, without recompiling anything. A similar feature has long
existed in Debian, where it is known as the update-alternatives
mechanism. In Gentoo we implemented this feature with eselect modules.

This mechanism has been tested by some users and Gentoo science team
developers. Thanks to these early testers, we've got some positive
feedback:

  https://github.com/gentoo/sci/issues/805#issuecomment-510469206
  https://github.com/gentoo/sci/issues/805#issuecomment-512097570

I sincerely invite users and developers who heavily rely on BLAS/LAPACK
libraries to test it. Should you find any problem, or have any
suggestion/question, please let me know :-)

[1]
https://wiki.gentoo.org/wiki/Google_Summer_of_Code/2019/Ideas/BLAS_and_LAPACK_runtime_switching



Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching (Re-designed)

2019-06-17 Thread Mo Zhou
Hi Michał,

Sorry for the late reply. I just ran into a severe hardware failure.

On 2019-06-13 07:49, Michał Górny wrote:
>>
>> sci-libs/{blas,cblas,lapack,lapacke}::gentoo should be deprecated. They
>> are based on exactly the same source tarball, and maintaining 4 ebuild
>> files for a single tarball is not a good choice IMHO. Those old ebuild
>> files seem to leverage the flexibility of the upstream build system,
>> because it enables one to, for example, skip the reference BLAS build
>> and use an existing optimized BLAS implementation instead. That
>> flexibility is hard to maintain and is no longer necessary with the
>> new runtime switching mechanism.
>>
>> That's why I propose to merge the 4 ebuilds into a single one:
>> sci-libs/lapack. We don't need to add the "reference" postfix
>> because no upstream will take the name "lapack". When talking
>> about "lapack" it's always the reference implementation.
> 
> What's the real gain here, and how does it compare to loss of
> flexibility of being able to build only what the package in question
> needs?

First let's see what these 4 components are:
1. blas: written in Fortran, provides fundamental linear algebra
 routines. libblas.so can work alone.
2. cblas: a thin C wrapper around the Fortran blas. That means
 libcblas.so calls libblas.so for the real calculation.
3. lapack: written in Fortran, frequently calls BLAS to
 implement higher-level linear algebra routines.
 liblapack.so needs libblas.so (Fortran).
4. lapacke: a thin C wrapper around the Fortran lapack.
 liblapacke.so needs liblapack.so.

The real gain from merging 4 ebuilds into 1 ebuild:
1. easier to maintain: updating 4 ebuilds on every single
   version bump is much harder than updating only 1.
   This will also make it easier to provide and maintain
   the virtual-* features in the long run.
2. avoids confusing or even potentially problematic
   setups, e.g. a user happens to compile OpenBLAS as
   the libblas provider and BLIS as the libcblas provider:

   appA -> libblas (OpenBLAS)
   appB -> libcblas (BLIS)
   appC -> liblapacke (Ref) -> liblapack (Ref) -> libblas (OpenBLAS)
        -> libcblas.so (BLIS)

   The user will be confused about which BLAS is
   really doing the calculation. Plus, mixing
   threading models can sometimes cause poor performance
   (e.g. openmp + pthread) or even silent corruption
   (e.g. GNU openmp + Intel openmp).

   Merging cblas into blas, and lapacke into lapack
   will make it harder to get things wrong.
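
   As an illustration only, the mixed setup above can be modelled in a
   few lines of Python. This is a minimal sketch, not part of the
   proposal: the LIBS table and the blas_backends() helper are
   hypothetical, and it assumes appC links directly against both
   liblapacke and libcblas.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical model of the setup above:
# library -> (provider, libraries it calls into).
LIBS: Dict[str, Tuple[str, List[str]]] = {
    "libblas":    ("OpenBLAS", []),
    "libcblas":   ("BLIS", []),
    "liblapack":  ("Reference", ["libblas"]),
    "liblapacke": ("Reference", ["liblapack"]),
}


def blas_backends(direct_deps: List[str]) -> Set[str]:
    """Return the providers of every BLAS/CBLAS library reachable from an app."""
    seen: Set[str] = set()
    backends: Set[str] = set()
    stack = list(direct_deps)
    while stack:
        lib = stack.pop()
        if lib in seen:
            continue
        seen.add(lib)
        provider, deps = LIBS[lib]
        if lib in ("libblas", "libcblas"):
            backends.add(provider)
        stack.extend(deps)
    return backends


# appC as drawn above; more than one entry means mixed backends.
print(sorted(blas_backends(["liblapacke", "libcblas"])))
```

   With a single "blas" selection covering both libblas and libcblas,
   the returned set would always contain exactly one provider.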

IMHO the flexibility mentioned above is not really necessary. Any
scientific computing user who needs performance and dislikes
the virtual-* solution could directly link their programs
against MKL or OpenBLAS without thinking about the reference
blas, because both MKL and OpenBLAS provide the full set
of blas, cblas, lapack and lapacke API and ABI via a single shared
object. Plus, that flexibility could be replaced by the
proposed runtime switching solution: by switching
the blas (cblas) selection, liblapack.so can be dynamically
linked against different optimized implementations.

Discarding this flexibility will only affect users who
insist on linking an unoptimized lapack against a specific
blas implementation. One may also run into trouble
with such flexibility, e.g.:

   libcblas (Reference) -> libblas.so (reference)
   liblapack (Reference) -> libopenblas.so

   appC -> (liblapacke, libcblas)
        --> liblapacke -> liblapack -> libopenblas
        --> libcblas (reference)

   libopenblas's ABI is a superset of libcblas's, so both
   libraries define the same symbols, which invites confusion
   and symbol clashes at run time (which definition wins
   depends on load order).
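
   The overlap can be made visible with a small sketch. The export
   lists below are hypothetical and heavily abbreviated (real libraries
   export far more symbols), and duplicated_symbols() is an
   illustrative helper, not an existing tool.

```python
from typing import Dict, List, Set

# Hypothetical, abbreviated export lists for the two libraries above.
EXPORTS: Dict[str, Set[str]] = {
    "libcblas.so.3 (reference)": {"cblas_ddot", "cblas_dgemm"},
    "libopenblas.so.0": {"ddot_", "dgemm_", "cblas_ddot", "cblas_dgemm", "dgesv_"},
}


def duplicated_symbols(loaded: List[str]) -> Dict[str, List[str]]:
    """Map each symbol defined more than once to its definers, in load order."""
    owners: Dict[str, List[str]] = {}
    for lib in loaded:
        for sym in EXPORTS[lib]:
            owners.setdefault(sym, []).append(lib)
    return {sym: libs for sym, libs in owners.items() if len(libs) > 1}


dups = duplicated_symbols(["libcblas.so.3 (reference)", "libopenblas.so.0"])
for sym in sorted(dups):
    # Assumption: with the usual ELF global lookup, the first library
    # in the search order supplies the definition.
    print(sym, "->", dups[sym][0])
```

   Every symbol listed here has two candidate definitions in the same
   process, which is the situation the consistent-backend design avoids.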

With the proposed (redesigned) solution, these potentially
bad cases can be avoided because the solution tries to keep
the backend consistent. Other people have had headaches with
BLAS/LAPACK flexibility as well, and they created FlexiBLAS.

In a word, the (4->1) change can reduce the maintenance cost
for (blas,cblas,lapack,lapacke) and make the virtual-* feature
easier to implement and maintain in the long run. Additionally,
the flexibility mentioned before is not really necessary once
the virtual-* feature is fully implemented.

Best,
Mo.



Re: [gentoo-dev] RFC: BLAS and LAPACK runtime switching (Re-designed)

2019-06-13 Thread Mo Zhou
Hi Gentoo devs,

I redesigned the solution for BLAS/LAPACK runtime switching.
The new solution is based on eselect + ld.so.conf. See below.

> Goal
> 
> 
>   * When a program is linked against libblas.so or liblapack.so
> provided by any BLAS/LAPACK provider, the eselect-based solution
> will allow user to switch the underlying library without recompiling
> anything.

Instead of manipulating symlinks, I wrote a dedicated eselect module
for BLAS:

https://github.com/cdluminate/my-overlay/blob/master/app-eselect/eselect-blas/files/blas.eselect-0.2

This implementation generates a corresponding ld.so.conf file
on switching and refreshes the ld.so cache.

advantages:

1. it no longer manipulates symlinks in directories managed by the
   package manager, see https://bugs.gentoo.org/531842 and
   https://bugs.gentoo.org/632624

2. we don't have to think about static library and header switching
   like Debian does.
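
To illustrate the idea (this is not the eselect module linked above,
which is written in bash), here is a minimal Python sketch of
"write an ld.so.conf fragment on switching". The file name
81-blas.conf and the helper write_blas_ld_conf() are hypothetical;
the /usr/lib64/blas/<impl> layout is the one used in this proposal.

```python
import os
import tempfile


def write_blas_ld_conf(conf_dir: str, libdir: str, impl: str) -> str:
    """Write a one-line ld.so.conf fragment pointing at the chosen implementation."""
    path = os.path.join(conf_dir, "81-blas.conf")  # hypothetical file name
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(f"{libdir}/blas/{impl}\n")
    os.replace(tmp, path)  # rename over the old fragment in one step
    # A real module would now run `ldconfig` (as root) to refresh the cache.
    return path


# Demonstration in a throwaway directory instead of /etc/ld.so.conf.d.
with tempfile.TemporaryDirectory() as conf_dir:
    fragment = write_blas_ld_conf(conf_dir, "/usr/lib64", "openblas")
    with open(fragment) as fh:
        print(fh.read().strip())
```

Only a configuration file owned by the module changes here; nothing
under the package-managed library directories is touched.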

>   * When a program is linked against a specific implementation, e.g.
> libmkl_rt.so, the solution doesn't break anything.

This still holds with the new solution.

> Solution
> 
> 
> Similar to Debian's update-alternatives mechanism, Gentoo's eselect
> is good at dealing with drop-in replacements as well. My preliminary

The redesigned solution diverges completely from Debian's solution.

* sci-libs/lapack provides the standard API and ABI for BLAS/CBLAS/LAPACK
  and LAPACKE. It provides a default set of libblas.so, libcblas.so,
  and liblapack.so. Reverse dependencies linked against these three
  libraries (reference blas) will take advantage of the runtime
  switching mechanism through USE="virtual-blas virtual-lapack".
  Reverse dependencies linked to specific implementations such as
  libopenblas.so won't be affected at all.

* Every non-standard BLAS/LAPACK implementation can be registered
  as an alternative via USE="virtual-blas virtual-lapack". Once the
  virtual-* flags are enabled, the ebuild will build some
  extra shared objects with the correct SONAME.

  For example:

  /usr/lib64/libblis.so.2 (SONAME=libblis.so.2, general purpose)
  /usr/lib64/blas/blis/libblas.so.3 (USE="virtual-blas", SONAME=libblas.so.3)
  /usr/lib64/blas/blis/libcblas.so.3 (USE="virtual-blas", SONAME=libcblas.so.3)
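
  Why the extra objects matter can be shown with a deliberately
  simplified model of library lookup. This sketch assumes
  first-match-wins over an ordered directory list; the real loader
  consults the ld.so cache, so treat it as an approximation. The
  helper resolve_soname() and the blas/reference directory are
  hypothetical.

```python
import os
import tempfile
from typing import List, Optional


def resolve_soname(soname: str, search_dirs: List[str]) -> Optional[str]:
    """Return the first file named `soname` found in `search_dirs`."""
    for directory in search_dirs:
        candidate = os.path.join(directory, soname)
        if os.path.exists(candidate):
            return candidate
    return None


# Build a throwaway tree with empty placeholder files named by SONAME.
root = tempfile.mkdtemp()
for sub in ("blas/blis", "blas/reference"):
    os.makedirs(os.path.join(root, sub))
    open(os.path.join(root, sub, "libblas.so.3"), "w").close()

blis_first = [os.path.join(root, "blas/blis"),
              os.path.join(root, "blas/reference")]
print(resolve_soname("libblas.so.3", blis_first))
```

  A program that only records libblas.so.3 picks up whichever
  directory comes first, so reordering the directories switches the
  implementation without relinking.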

* Reverse dependencies of BLAS/LAPACK could optionally provide the
  "virtual-blas virtual-lapack" USE flags.

  if use virtual-*:
      link against reference blas/lapack
  else:
      link against whatever the ebuild maintainer likes and
      bypass the switching mechanism
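
  The same decision, written as a small runnable Python sketch. The
  function blas_link_flags() is hypothetical and only mirrors the
  pseudocode; a real ebuild would express this in bash.

```python
from typing import List, Set

VIRTUAL_FLAGS = {"virtual-blas", "virtual-lapack"}


def blas_link_flags(use_flags: Set[str], preferred: List[str]) -> List[str]:
    """Pick linker flags for a reverse dependency according to its USE flags."""
    if use_flags & VIRTUAL_FLAGS:
        # Generic SONAMEs: switchable at run time.
        return ["-lblas", "-llapack"]
    # No virtual-* flag: link whatever the maintainer prefers.
    return list(preferred)


print(blas_link_flags({"virtual-blas", "virtual-lapack"}, ["-lopenblas"]))
print(blas_link_flags(set(), ["-lopenblas"]))
```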

> Proposed Changes
> 
> 
> 1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
>    main repo. They use exactly the same source tarball. It's not quite
>    helpful to package these components in a fine-grained manner. A
>    single sci-libs/lapack package is enough.

sci-libs/{blas,cblas,lapack,lapacke}::gentoo should be deprecated. They
are based on exactly the same source tarball, and maintaining 4 ebuild
files for a single tarball is not a good choice IMHO. Those old ebuild
files seem to leverage the flexibility of the upstream build system,
because it enables one to, for example, skip the reference BLAS build
and use an existing optimized BLAS implementation instead. That
flexibility is hard to maintain and is no longer necessary with the
new runtime switching mechanism.

That's why I propose to merge the 4 ebuilds into a single one:
sci-libs/lapack. We don't need to add the "reference" postfix
because no upstream will take the name "lapack". When talking
about "lapack" it's always the reference implementation.

> 2. Merge the "cblas" eselect unit into "blas" unit. It is potentially
>    harmful when "blas" and "cblas" point to different implementations.
>    That means "app-eselect/eselect-cblas" should be deprecated.

eselect-cblas should be deprecated. That affects gsl because it is
registered as a cblas alternative. gsl doesn't provide the standard
BLAS (Fortran) API+ABI, so it will not be added as a runtime switching
candidate.

Does this redesigned solution look acceptable now?

Best,
Mo.



[gentoo-dev] RFC: BLAS and LAPACK runtime switching

2019-05-28 Thread Mo Zhou
Hi Gentoo devs,

Classical numerical linear algebra libraries, BLAS[1] and LAPACK[2],
play important roles in the scientific computing field, as many
software packages such as NumPy, SciPy, Julia, Octave, and R are built
upon them.

There is a standard implementation of BLAS and LAPACK, named netlib
or simply the "reference implementation". This implementation has been
provided by Gentoo's main repo. However, it has a major problem:
performance. On the other hand, a number of well-optimized BLAS/LAPACK
implementations exist, including OpenBLAS (free), BLIS (free),
MKL (non-free), etc., but none of them has been properly integrated
into the Gentoo distribution.

I'm writing to propose a good solution to this problem. If no Gentoo
developer objects to this proposal, I'll keep moving forward and
start submitting PRs to the Gentoo main repo.

Historical Obstacle
-------------------

Different BLAS/LAPACK implementations are expected to be compatible
with each other at both the API and ABI level. They can be used as
drop-in replacements for one another. This sounds nice, but the
difference in SONAME hampered the Gentoo integration of well-optimized
ones.

Assume a Gentoo user compiled a pile of packages on top of the reference
BLAS and LAPACK, so these reverse dependencies are linked against
libblas.so.3 and liblapack.so.3. When the user discovers that
OpenBLAS provides much better performance, they'll have to recompile
the whole reverse dependency tree in order to take advantage of
OpenBLAS, because the SONAME of OpenBLAS is libopenblas.so.0. When the
user wants to try MKL (libmkl_rt.so), they'll have to recompile the
whole reverse dependency tree again.
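
The cost can be sketched with a toy model. The package names, the
NEEDED table and packages_needing_relink() below are hypothetical and
only illustrate the SONAME mismatch; the SONAMEs themselves are the
ones named above.

```python
from typing import Dict, List, Set

# Hypothetical packages and the SONAMEs recorded in them at link time.
NEEDED: Dict[str, Set[str]] = {
    "numpy":  {"libblas.so.3", "liblapack.so.3"},
    "octave": {"libblas.so.3", "liblapack.so.3"},
    "myapp":  {"libopenblas.so.0"},
}
BLAS_SONAMES = {"libblas.so.3", "liblapack.so.3",
                "libopenblas.so.0", "libmkl_rt.so"}


def packages_needing_relink(provided: Set[str]) -> List[str]:
    """Packages that cannot use a provider offering only `provided` SONAMEs."""
    return sorted(
        pkg for pkg, needed in NEEDED.items()
        if (needed & BLAS_SONAMES) - provided
    )


# A provider that ships only its own SONAME:
print(packages_needing_relink({"libopenblas.so.0"}))
# A provider that additionally ships the generic SONAMEs:
print(packages_needing_relink(
    {"libopenblas.so.0", "libblas.so.3", "liblapack.so.3"}))
```

In the first case every package linked against the generic names has
to be rebuilt; in the second case the list is empty.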

This is not friendly to our earth.

Goal
----

  * When a program is linked against libblas.so or liblapack.so
    provided by any BLAS/LAPACK provider, the eselect-based solution
    will allow the user to switch the underlying library without
    recompiling anything.

  * When a program is linked against a specific implementation, e.g.
    libmkl_rt.so, the solution doesn't break anything.

Solution
--------

Similar to Debian's update-alternatives mechanism, Gentoo's eselect
is good at dealing with drop-in replacements as well. My preliminary
investigation suggests that eselect is enough for enabling BLAS/LAPACK
runtime switching. Hence, the proposed solution is eselect-based:

  * Every BLAS/LAPACK implementation should provide generic library
    and eselect candidate libraries at the same time. Taking netlib,
    BLIS and OpenBLAS as examples:

    reference:

      usr/lib64/blas/reference/libblas.so.3 (SONAME=libblas.so.3)
        -- default BLAS provider
        -- candidate of the eselect "blas" unit
        -- will be symlinked to usr/lib64/libblas.so.3 by eselect

      usr/lib64/lapack/reference/liblapack.so.3 (SONAME=liblapack.so.3)
        -- default LAPACK provider
        -- candidate of the eselect "lapack" unit
        -- will be symlinked to usr/lib64/liblapack.so.3 by eselect

    blis (doesn't provide LAPACK):

      usr/lib64/libblis.so.2  (SONAME=libblis.so.2)
        -- general purpose

      usr/lib64/blas/blis/libblas.so.3 (SONAME=libblas.so.3)
        -- candidate of the eselect "blas" unit
        -- will be symlinked to usr/lib64/libblas.so.3 by eselect
        -- compiled from the same set of object files as libblis.so.2

    openblas:

      usr/lib64/libopenblas.so.0 (SONAME=libopenblas.so.0)
        -- general purpose

      usr/lib64/blas/openblas/libblas.so.3 (SONAME=libblas.so.3)
        -- candidate of the eselect "blas" unit
        -- will be symlinked to usr/lib64/libblas.so.3 by eselect
        -- compiled from the same set of object files as
           libopenblas.so.0

      usr/lib64/lapack/openblas/liblapack.so.3 (SONAME=liblapack.so.3)
        -- candidate of the eselect "lapack" unit
        -- will be symlinked to usr/lib64/liblapack.so.3 by eselect
        -- compiled from the same set of object files as
           libopenblas.so.0
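
The symlink switch in this layout can be sketched as follows. This is
an illustrative Python model run in a temporary directory, not the
eselect module itself; select_blas() is a hypothetical helper and the
placeholder files are empty.

```python
import os
import tempfile


def select_blas(libdir: str, impl: str) -> None:
    """Point <libdir>/libblas.so.3 at <libdir>/blas/<impl>/libblas.so.3."""
    target = os.path.join("blas", impl, "libblas.so.3")  # relative target
    link = os.path.join(libdir, "libblas.so.3")
    tmp = link + ".new"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)
    os.replace(tmp, link)  # swap the new link into place in one rename


# Throwaway stand-in for usr/lib64 with two candidates.
libdir = tempfile.mkdtemp()
for impl in ("reference", "openblas"):
    os.makedirs(os.path.join(libdir, "blas", impl))
    open(os.path.join(libdir, "blas", impl, "libblas.so.3"), "w").close()

select_blas(libdir, "reference")
select_blas(libdir, "openblas")
print(os.readlink(os.path.join(libdir, "libblas.so.3")))
```

Programs linked against libblas.so.3 keep the same NEEDED entry; only
the link target changes between the two calls.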

This solution is similar to Debian's[3]. It achieves our goal, and it
requires us to patch upstream build systems (as Debian does).
A preliminary demonstration of this solution is available, see below.

Is this solution reliable?
--------------------------

* A similar solution has been used by Debian for many years.
* Many projects call BLAS/LAPACK libraries through FFI, including Julia.
  (See Julia's standard library: LinearAlgebra)

Proposed Changes
----------------

1. Deprecate sci-libs/{blas,cblas,lapack,lapacke}-reference from gentoo
   main repo. They use exactly the same source tarball. It's not quite
   helpful to package these components in a fine-grained manner. A
   single sci-libs/lapack package is enough.

2. Merge the "cblas" eselect unit into "blas" unit. It is potentially
   harmful when "blas" and "cblas" point to different implementations.
   That means "app-eselect/eselect-cblas" should be deprecated.

3. Update virtual/{blas,cblas,lapack,lapacke}. BLAS/LAPACK providers