On Tue, 28 Jul 2020 at 18:02, Dave Love <lovesh...@fedoraproject.org> wrote:
>
> I'm offering the experience of doing all this work from various
> different points of view.  I don't off-hand remember instances where
> particular problems have occurred, because I've done quite a lot of this.
> Isn't engineering experience valuable?  No-one seems to be offering
> counter-experience to evaluate trade-offs, just assertions about what
> the requirements are.

It is very valuable, and I appreciate your comments. But note that the
author of FlexiBLAS also has engineering experience: apart from his
research in academia, he has been managing an HPC cluster for a number
of years.

I'm certainly not an expert here. But one thing I learnt from talking
to the various stakeholders is that basically everyone has a strong
opinion about this, even though no option seems to be universally
right for every application.

I think we can agree that this has been stalled for many years in
Fedora. Now, I'm willing to get my hands dirty and I'm just doing my
best to bring something better than what we had.

> > We have a more limited choice right now. See my comment above about
> > applications linked against OpenBLAS, or ATLAS. With FlexiBLAS, there
> > are fewer limitations and more flexibility (and I believe the specific
> > use case you have in mind for BLIS could be discussed upstream).
>
> I don't see how that's true.  It's clearly more flexible to be able to
> have different implementations of an interface more-or-less trivially --
> a one-liner -- than to have a fixed implementation with limited choice.
> Please realize that there's more to research computing than what's
> packaged in Fedora, and there's more than x86 or single-threaded stuff.
> (More of it could be there if it wasn't such a dispiriting business.)

Yes, and if you want something different,

FLEXIBLAS=/some/other/library.so ./some_program

and you are good to go. And you still have the LD_PRELOAD trick if you
want to bypass FlexiBLAS entirely, as you can do for something that is
currently linked against e.g. OpenBLAS. In this regard, I don't see
how FlexiBLAS is any more problematic than the current status quo.
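
To make the "temporarily" part concrete, here is a minimal sketch of how
the one-liner above is scoped. It only demonstrates standard shell
behaviour, not FlexiBLAS itself: the backend path is the placeholder
from the example (not a real file), and `sh -c` stands in for
./some_program as an assumption so that the snippet runs anywhere.

```shell
# Start from a clean state so the demonstration is reproducible.
unset FLEXIBLAS

# Placeholder path copied from the example above; not a real library.
backend=/some/other/library.so

# A VAR=value prefix exports the variable to that one command only.
# `sh -c` is a stand-in for ./some_program (assumption for illustration).
seen_by_child=$(FLEXIBLAS="$backend" sh -c 'printf %s "$FLEXIBLAS"')

# The calling shell is left untouched, so later programs keep the default.
seen_by_parent=${FLEXIBLAS-unset}

echo "child sees:  $seen_by_child"
echo "parent sees: $seen_by_parent"
```

In other words, the override lasts for a single invocation and nothing
needs to be undone afterwards.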

> > I'm not sure I'm following you anymore. You may also want to use 3
> > functions from OpenBLAS serial, 4 functions from OpenBLAS OpenMP, 1
> > from ATLAS, 5 from BLIS and the rest from whatever other library. But
> > it doesn't scale.
>
> That wouldn't make sense, and I'm worried by the suggestion.

That was the point. Glad we are aligned here.

> > That's obviously a dramatization. My point is that it's almost
> > impossible to cover all the very specialised HPC use cases out there.
> > But I would argue that, if there's a way to cover them all, that could
> > be achieved by adding features to FlexiBLAS, because it's the most
> > general way to solve the issue of the implementation disparity in the
> > BLAS/LAPACK landscape: just exposing a complete API and then
> > internally rewiring those calls to the appropriate libraries given
> > some configuration.
>
> It isn't the most general way to replace things at runtime.  The most
> general way is to substitute different implementations of the ABI with
> dynamic linking.  Note there's a de facto ABI, not just an API.
> Consider I want to speed up R by replacing the serial BLAS with a
> parallel one; that's fine, as in the reference I posted.  You're saying
> I shouldn't have that choice because you're going to define what's
> serial and parallel in that case.  Also I apparently shouldn't be able
> to substitute a shim to do tracing by this logic, or a malloc
> implementation to do profiling/debugging.

I didn't say that. You can still do all of that. You manage an HPC
cluster. You know how to do that in several ways, with and without
FlexiBLAS. But what if a regular R user wants to replace the serial
BLAS? They don't know how. I replied recently on the R-SIG-Fedora
mailing list precisely because an R user was having trouble with
OpenBLAS serial. With FlexiBLAS, that's very easy. You can replace the
default, temporarily or permanently, with another implementation
shipped in Fedora, or with anything else you got from another source
(e.g., MKL).

> > We support a number of architectures in which BLIS doesn't perform
> > well, so I think we would agree that this rules out BLIS as a default.
> > Then, both @jussilehtola and the authors of FlexiBLAS independently
> > agree that the OpenMP version of OpenBLAS would be the best default.
>
> I haven't seen measurements to support these statements.  Where are
> they?  It's not obvious from the ones I referenced that OpenBLAS is
> generally better than BLIS in the reference I posted.

The ones you referenced are arguably too artificial. AFAIK, they rely
on a set of fine-tuned parameters that are hard to get right, and
there's no way to know which values are reasonable for a specific
application. Also, they enabled CPU affinity for BLIS and disabled it
for OpenBLAS. Is there any other third-party benchmark replicating
these results?

--
Iñaki Úcar
