On 3/27/24 08:39, jesse koops wrote:
Of course you are correct about the portability. But since at least
one other CRAN package by a renowned author does it successfully, I
figured I'd experiment first on my machine and learn about portability
later. Thank you for the links and the warning about the bug. I was
aware of that; however, I am careful to use only the "loadu" and
"storeu" variants, so I thought this would not bite me. Do you know if
my assumption is in error?

My advice is: please do not publish any packages doing this low-level stuff unless you fully understand the details yourself. If you don't, please work at a higher level of abstraction and use existing code for the low-level things, to avoid adding to the maintenance costs. These things can take a very long time to debug.

The GCC bug on Windows that I ran into only affects instructions that require aligned operands (on the stack), aligned at a 32-byte boundary.
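
To make the alignment requirement concrete, here is a small sketch in portable C++ (no intrinsics are called, so it compiles without any special flags). The helper name is_aligned is hypothetical and introduced only for illustration. For reference, _mm256_load_pd and _mm256_store_pd require a 32-byte-aligned address, while _mm256_loadu_pd and _mm256_storeu_pd accept any address; a local variable of type __m256d is itself an object with 32-byte alignment.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if p is a multiple of `alignment` bytes.
// An address passed to _mm256_load_pd / _mm256_store_pd must satisfy
// is_aligned(p, 32); _mm256_loadu_pd / _mm256_storeu_pd have no such
// requirement.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

For example, given `alignas(32) static double buf[8];`, the address `buf` is 32-byte aligned and so is `buf + 4`, but `buf + 1` is only 8 bytes past an aligned address and would not be a valid argument for an aligned load.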

Tomas


On Tue, 26 Mar 2024 at 15:51, Tomas Kalibera <tomas.kalib...@gmail.com> wrote:

On 3/26/24 10:53, jesse koops wrote:
Hello R-package-devel,

I recently got inspired by the RcppSimdJson package to try out SIMD
registers. It works fantastically on my computer, but I struggle to find
information on how to make it portable. It doesn't help in this case
that R and Rcpp make including C++ code so easy that I have never had
to learn about CMake and compiler flags. I would appreciate any help,
including of the type: "go read instructions at ...".

I use RcppArmadillo and Rcpp. I currently include the following header:

#include <immintrin.h>

The functions in immintrin that I use are:

_mm256_loadu_pd
_mm256_set1_pd
_mm256_mul_pd
_mm256_fmadd_pd
_mm256_storeu_pd

and I define up to four __m256d registers. From information found
online (not sure where anymore) I constructed the following Makevars
file:

CXX_STD = CXX14

PKG_CPPFLAGS = -I../inst/include -mfma -msse4.2 -mavx

PKG_CXXFLAGS = $(SHLIB_OPENMP_CXXFLAGS)
PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)

(I also use OpenMP, which has always worked fine; I just included all
lines for completeness.) R CMD check gives me two notes:

─  using R version 4.3.2 (2023-10-31 ucrt)
─  using platform: x86_64-w64-mingw32 (64-bit)
─  R was compiled by
         gcc.exe (GCC) 12.3.0
         GNU Fortran (GCC) 12.3.0

❯ checking compilation flags used ... NOTE
    Compilation used the following non-portable flag(s):
      '-mavx' '-mfma' '-msse4.2'

❯ checking C++ specification ... NOTE
      Specified C++14: please drop specification unless essential

But as far as I understand, the flags are necessary, at least in GCC.
How can I make this portable and CRAN-acceptable?

I think the best way for portability is to use a higher-level library
that has already done the low-level business of maintaining multiple
versions of the code (with multiple instruction sets) and choosing one
appropriate for the current CPU. It could be, say, LAPACK, BLAS, or OpenMP,
depending on the problem at hand. In some cases, code can be rewritten
so that the compiler can vectorize it better, using the level of
vectorized instructions that have been enabled.
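
As a rough illustration of that last point, a loop of the following shape is usually a good candidate for automatic vectorization: contiguous arrays, independent iterations, and no function calls in the body. The name axpy_plain is made up for this sketch, and whether a given compiler actually vectorizes it depends on the optimization level and on the instruction sets that are enabled.

```cpp
#include <cstddef>

// Hypothetical example: y[i] = a * x[i] + y[i] written as a plain loop.
// No intrinsics and no compiler-specific flags are needed; the compiler
// is free to use whatever vector instructions it has been allowed to emit.
void axpy_plain(std::size_t n, double a, const double* x, double* y) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

The same source builds unchanged with GCC or clang, on x86_64 or aarch64, which is the property the intrinsics version lacks.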

Unconditionally using GCC-specific or architecture-specific options in
packages would certainly not be portable. Even on Windows, R is now used
also with clang and on aarch64, so one should not assume a concrete
compiler and architecture.
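
For readers who still want to see what "not assuming a compiler and architecture" could look like in code, below is a minimal sketch, not a recommendation. It assumes GCC or clang on x86_64, where the `target` function attribute and `__builtin_cpu_supports` are available; every other platform, and Windows as a deliberate precaution given the bug discussed in this thread, takes the scalar path. All function and macro names (axpy_avx, axpy_scalar, axpy_dispatch, SKETCH_HAVE_AVX_PATH) are hypothetical, and for simplicity the AVX path uses a separate multiply and add instead of _mm256_fmadd_pd.

```cpp
#include <cstddef>

// Assumption: GCC or clang (clang also defines __GNUC__) on x86_64,
// excluding Windows. Everything else uses only the scalar code below.
#if defined(__x86_64__) && defined(__GNUC__) && !defined(_WIN32)
#include <immintrin.h>
#define SKETCH_HAVE_AVX_PATH 1

// Compiled with AVX enabled for this one function only, so no -mavx
// is needed in the package-wide compiler flags.
__attribute__((target("avx")))
static void axpy_avx(std::size_t n, double a, const double* x, double* y) {
    const __m256d va = _mm256_set1_pd(a);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m256d vx = _mm256_loadu_pd(x + i);  // unaligned load
        __m256d vy = _mm256_loadu_pd(y + i);
        _mm256_storeu_pd(y + i, _mm256_add_pd(_mm256_mul_pd(va, vx), vy));
    }
    for (; i < n; ++i) y[i] += a * x[i];      // remaining 0 to 3 elements
}
#endif

// Portable fallback, valid on any compiler and architecture.
static void axpy_scalar(std::size_t n, double a, const double* x, double* y) {
    for (std::size_t i = 0; i < n; ++i) y[i] += a * x[i];
}

// Chooses the implementation at run time.
// Returns true if the AVX path was taken, false for the scalar path.
bool axpy_dispatch(std::size_t n, double a, const double* x, double* y) {
#ifdef SKETCH_HAVE_AVX_PATH
    if (__builtin_cpu_supports("avx")) {
        axpy_avx(n, a, x, y);
        return true;
    }
#endif
    axpy_scalar(n, a, x, y);
    return false;
}
```

Maintaining, testing, and debugging both paths is exactly the cost that an existing higher-level library already carries, which is why that route is suggested first.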

Please note also that GCC on Windows has a bug due to which AVX2
instructions cannot be used reliably - the compiler doesn't always
properly align local variables on the stack when emitting these. See
[1,2] for more information.

Best
Tomas

[1] https://stat.ethz.ch/pipermail/r-sig-windows/2024q1/000113.html
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

kind regards,
Jesse

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
