Hi Peter,

I would recommend that you hold off on this change for a bit. We are
planning to move the non-blocking threadpool from unsupported into core
in the near future and to use it as the basis for multi-threading
without depending on OpenMP.
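
If you want to experiment in the meantime, a rough (untested) sketch
against the current unsupported pool could look like the following;
the details may well change once it moves into core:

#include <Eigen/Sparse>
#include <unsupported/Eigen/CXX11/ThreadPool>

// One task per output column, synchronized with a barrier. Each task
// writes only its own column of C, so no locking is needed.
void symm_times_dense(const Eigen::SparseMatrix<double>& A,
                      const Eigen::MatrixXd& B,
                      Eigen::MatrixXd& C,
                      Eigen::ThreadPool& pool)
{
  const Eigen::Index cols = B.cols();
  Eigen::Barrier done(static_cast<unsigned int>(cols));
  for (Eigen::Index k = 0; k < cols; ++k) {
    pool.Schedule([&, k] {
      C.col(k) = A.selfadjointView<Eigen::Upper>() * B.col(k);
      done.Notify();
    });
  }
  done.Wait();  // block until every column has been computed
}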

Rasmus

On Mon, Sep 13, 2021 at 5:21 AM Peter <[email protected]> wrote:

> Dear all,
>
> I'm currently playing with the sparse matrix implementation within Eigen,
> namely Eigen::SparseMatrix<double, Eigen::ColMajor> combined with a
> self-adjoint view.
>
> In my application I need to multiply a symmetric sparse matrix by a dense
> matrix, where the dimensions of the matrices are on the order of up to a
> few tens of thousands.
>
> I tried to parallelize the dot product in SparseSelfAdjointView.h by
> copying an OpenMP directive from other parts of Eigen, and I also tried
> to parallelize the outer loop directly with std::execution::par; see the
> diff below.
>
> In summary, I do not see any effect from the parallelization.
> Before digging deeper and building a minimal working example:
> has anyone already achieved this?
>
> Could one actually call the corresponding MKL routine directly, or are
> the internal storage schemes different?
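>
> For reference, here is roughly what I imagine such a call could look like
> (an untested sketch on my part; I'm assuming a compressed ColMajor matrix,
> i.e. CSC, and that Eigen's default int StorageIndex matches MKL_INT in an
> LP64 build; the function name is mine):
>
> #include <Eigen/Sparse>
> #include <mkl_spblas.h>
>
> // C = alpha * A * B + beta * C, where A is symmetric with only its
> // upper triangle stored.
> void mkl_symm_spmm(Eigen::SparseMatrix<double, Eigen::ColMajor>& A,
>                    const Eigen::MatrixXd& B, Eigen::MatrixXd& C,
>                    double alpha, double beta)
> {
>   A.makeCompressed();  // MKL needs the plain CSC arrays
>   sparse_matrix_t mklA;
>   mkl_sparse_d_create_csc(&mklA, SPARSE_INDEX_BASE_ZERO,
>                           (MKL_INT)A.rows(), (MKL_INT)A.cols(),
>                           A.outerIndexPtr(),      // column starts
>                           A.outerIndexPtr() + 1,  // column ends
>                           A.innerIndexPtr(),      // row indices
>                           A.valuePtr());
>
>   matrix_descr descr;
>   descr.type = SPARSE_MATRIX_TYPE_SYMMETRIC;
>   descr.mode = SPARSE_FILL_MODE_UPPER;
>   descr.diag = SPARSE_DIAG_NON_UNIT;
>
>   mkl_sparse_d_mm(SPARSE_OPERATION_NON_TRANSPOSE, alpha, mklA, descr,
>                   SPARSE_LAYOUT_COLUMN_MAJOR,
>                   B.data(), (MKL_INT)B.cols(), (MKL_INT)B.rows(),
>                   beta, C.data(), (MKL_INT)C.rows());
>   mkl_sparse_destroy(mklA);
> }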
>
> Best regards
> Peter
>
> P.S.: This is what I tried, as a diff against the current master branch:
>
> :~/Eigen/eigen/Eigen/src/SparseCore$ git diff SparseSelfAdjointView.h
> diff --git a/Eigen/src/SparseCore/SparseSelfAdjointView.h b/Eigen/src/SparseCore/SparseSelfAdjointView.h
> index 0302ef3a4..91e7d495b 100644
> --- a/Eigen/src/SparseCore/SparseSelfAdjointView.h
> +++ b/Eigen/src/SparseCore/SparseSelfAdjointView.h
> @@ -10,6 +10,11 @@
>  #ifndef EIGEN_SPARSE_SELFADJOINTVIEW_H
>  #define EIGEN_SPARSE_SELFADJOINTVIEW_H
>
> +#include <iostream> /// only for debugging
> +#include <boost/range/irange.hpp>
> +
> +#include <execution>
> +
>  #include "./InternalHeaderCheck.h"
>
>  namespace Eigen {
> @@ -295,8 +300,19 @@ inline void sparse_selfadjoint_time_dense_product(const SparseLhsType& lhs, cons
>    SparseLhsTypeNested lhs_nested(lhs);
>    LhsEval lhsEval(lhs_nested);
>
> +  Eigen::initParallel();
> +  Index threads = Eigen::nbThreads();
> +
> +  std::cout << "\ndot threads: " << threads << " rhs-cols: " << rhs.cols() << std::endl;
> +
> +  // #pragma omp parallel for
> +  // #pragma omp parallel for schedule(dynamic,(rhs.cols()+threads*4-1)/(threads*4)) num_threads(threads)
> +  // for (Index k=0; k<rhs.cols(); ++k)
> +
> +  auto r = boost::irange(rhs.cols());
> +
> +  std::for_each(std::execution::par, r.begin(), r.end(),  [&](Index k)
>    // work on one column at once
> -  for (Index k=0; k<rhs.cols(); ++k)
>    {
>      for (Index j=0; j<lhs.outerSize(); ++j)
>      {
> @@ -330,6 +346,7 @@ inline void sparse_selfadjoint_time_dense_product(const SparseLhsType& lhs, cons
>          res.coeffRef(j,k) += alpha * i.value() * rhs.coeff(j,k);
>      }
>    }
> +  );
>  }
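>
> P.P.S.: One caveat, as far as I understand it: with GCC's libstdc++ the
> std::execution policies are implemented on top of Intel TBB, so the test
> program has to be linked against it, e.g. (hypothetical file name)
>
> g++ -std=c++17 -O2 test.cpp -ltbb
>
> as otherwise the parallel overloads are not usable.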