A few thoughts on this at a high level:
1.  Most of the libraries don't support runtime dispatch (libsimdpp seems
to be the exception here), so we should decide if we want to roll our own
dynamic dispatch mechanism.
2.  It isn't clear to me from the linked PR what the performance delta is
between explicit SIMD code and what the compiler would generate.  For simple
aggregates of non-null data I would expect pretty good auto-vectorization.
Compiler auto-vectorization seems to get better over time.  For instance
the scalar example linked in the paper seems to get vectorized somewhat
under Clang 10 (https://godbolt.org/z/oPopQL).
3.  It appears there are some efforts to make a standardized C++ SIMD
library [1], which might be based on Vc.
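
To make point 2 concrete, here is a minimal sketch of the kind of simple
aggregate I have in mind.  SumNonNull is a hypothetical name, not something
taken from the linked PR or the paper.  Whether the loop actually gets
vectorized depends on the compiler version and flags, so it is worth
checking the generated assembly (e.g. on godbolt) rather than assuming.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (illustration only): sums a contiguous buffer that
// has no nulls, so there is no validity bitmap to consult in the loop.
// The loop body has no branches and no cross-iteration dependency other
// than the reduction itself, which is the shape auto-vectorizers tend to
// handle well.
int64_t SumNonNull(const int32_t* values, std::size_t length) {
  int64_t total = 0;
  for (std::size_t i = 0; i < length; ++i) {
    total += values[i];  // widened to 64 bits before accumulating
  }
  return total;
}
```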

My initial thought is that in the short term we should focus on the
dynamic dispatch question (continue to build our own vs. adopt an existing
library) and lean on the compiler for most vectorization.  Using
intrinsics should be limited to complex numerical functions and places
where the compiler fails to vectorize/translate well (e.g. bit
manipulations).
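
For the "build our own" side of the dispatch question, a rough sketch of
what I mean is below.  All names here (SumScalar, SumAvx2, ResolveSum, Sum,
SKETCH_HAS_AVX2_TARGET) are hypothetical and not from our codebase or any
of the libraries mentioned.  It assumes GCC or Clang on x86, where
__builtin_cpu_supports and the target function attribute are available;
other compilers and architectures just fall back to the scalar kernel.
Note that the AVX2 variant is still a plain loop: the attribute only lets
the compiler use AVX2 when vectorizing that one function.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: only GCC/Clang on x86 provide the builtins used below.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define SKETCH_HAS_AVX2_TARGET 1
#else
#define SKETCH_HAS_AVX2_TARGET 0
#endif

// Baseline kernel, compiled with the default target flags.
static int64_t SumScalar(const int32_t* values, std::size_t length) {
  int64_t total = 0;
  for (std::size_t i = 0; i < length; ++i) total += values[i];
  return total;
}

#if SKETCH_HAS_AVX2_TARGET
// Same loop, but the compiler is allowed to emit AVX2 instructions for
// this function only. No intrinsics are involved.
__attribute__((target("avx2")))
static int64_t SumAvx2(const int32_t* values, std::size_t length) {
  int64_t total = 0;
  for (std::size_t i = 0; i < length; ++i) total += values[i];
  return total;
}
#endif

using SumFn = int64_t (*)(const int32_t*, std::size_t);

// Picks a kernel based on what the running CPU reports.
static SumFn ResolveSum() {
#if SKETCH_HAS_AVX2_TARGET
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx2")) return SumAvx2;
#endif
  return SumScalar;
}

// Public entry point: the kernel is resolved once, on first call.
int64_t Sum(const int32_t* values, std::size_t length) {
  static const SumFn impl = ResolveSum();
  return impl(values, length);
}
```

The function pointer is resolved once and cached, so the per-call cost is
one indirect call.  An existing library would replace ResolveSum and the
per-target attributes with its own machinery, but the overall shape should
be similar.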

If we do find the need for a dedicated library, I would lean towards
something that will converge to a standard, to reduce additional
dependencies in the long run.  That being said, most of these libraries
seem to be header-only, so the dependency is fairly lightweight and we can
vendor them if need be.

[1] https://en.cppreference.com/w/cpp/experimental/simd





On Tue, Jun 9, 2020 at 3:32 AM Antoine Pitrou <anto...@python.org> wrote:

>
> Thank you.  xsimd used to require C++14, but apparently they have
> demoted it to C++11.  Good!
>
> Regards
>
> Antoine.
>
>
> On 09/06/2020 at 12:04, Maarten Breddels wrote:
> > Hi Antoine,
> >
> > Adding xsimd to the list of options:
> >  * https://github.com/xtensor-stack/xsimd
> > Not sure how it compares to the rest though.
> >
> > cheers,
> >
> > Maarten
> >
>
