Dear all,
I hope that this new implementation also improves the scaling of the Eigen
solvers, e.g. conjugate gradient and biconjugate gradient. What I usually
get is a factor-2 speedup, independent of the number of threads.
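
For reference, a minimal sketch of the kind of setup where I see this
ceiling (the 1-D Laplacian test matrix and the thread count are just
placeholders; timing code omitted). As far as I understand, the
multi-threaded mat-vec inside CG needs an OpenMP build and a row-major
sparse matrix:

  #include <Eigen/Sparse>
  #include <Eigen/IterativeLinearSolvers>
  #include <iostream>

  int main()
  {
    const int n = 10000;
    // Simple SPD test matrix: 1-D Laplacian (tridiagonal), fully stored.
    Eigen::SparseMatrix<double, Eigen::RowMajor> A(n, n);
    A.reserve(Eigen::VectorXi::Constant(n, 3));
    for (int i = 0; i < n; ++i) {
      A.insert(i, i) = 2.0;
      if (i > 0)     A.insert(i, i - 1) = -1.0;
      if (i + 1 < n) A.insert(i, i + 1) = -1.0;
    }
    A.makeCompressed();

    Eigen::setNbThreads(4);  // no effect unless compiled with OpenMP
    Eigen::ConjugateGradient<Eigen::SparseMatrix<double, Eigen::RowMajor>,
                             Eigen::Lower | Eigen::Upper> cg;
    cg.compute(A);
    Eigen::VectorXd b = Eigen::VectorXd::Ones(n);
    Eigen::VectorXd x = cg.solve(b);
    std::cout << "#iterations: " << cg.iterations() << "\n";
  }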

Thanks for the effort.
Best regards,
Lorenzo



On Mon, Sep 13, 2021, 17:52 Rasmus Munk Larsen <[email protected]> wrote:

> Hi Peter,
>
> I would recommend that you hold off on this change for a bit. We are
> planning to move the non-blocking threadpool from unsupported into core
> in the near future and to use it as the basis for multi-threading
> without depending on OpenMP.
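>
> For the curious, a minimal sketch of what the unsupported pool looks like
> today (my reading of the current headers; names may well change when it
> moves into core):
>
>   #include <unsupported/Eigen/CXX11/ThreadPool>
>
>   Eigen::ThreadPool pool(4);        // non-blocking pool with 4 workers
>   Eigen::Barrier barrier(100);      // counts down as tasks finish
>   for (int i = 0; i < 100; ++i)
>     pool.Schedule([&barrier, i] {
>       // ... work on chunk i, e.g. one column of the result ...
>       barrier.Notify();
>     });
>   barrier.Wait();                   // block until all 100 tasks ran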
>
> Rasmus
>
> On Mon, Sep 13, 2021 at 5:21 AM Peter <[email protected]> wrote:
>
>> Dear all,
>>
>> I'm currently playing with the sparse matrix implementation within Eigen,
>> namely Eigen::SparseMatrix<double, Eigen::ColMajor> combined with a
>> self-adjoint view.
>>
>> In my application I need to multiply a symmetric sparse matrix by a
>> dense matrix, where the dimensions of the matrices are on the order of
>> a few tens of thousands.
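>>
>> Concretely, the operation is the following (a minimal sketch, assuming
>> only the lower triangle of A is stored):
>>
>>   #include <Eigen/Sparse>
>>   #include <Eigen/Dense>
>>
>>   // C = A * B with A symmetric and only its lower triangle stored.
>>   Eigen::MatrixXd multiply(const Eigen::SparseMatrix<double, Eigen::ColMajor>& A,
>>                            const Eigen::MatrixXd& B)
>>   {
>>     return A.selfadjointView<Eigen::Lower>() * B;
>>   }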
>>
>> I tried to parallelize the dot product in SparseSelfAdjointView.h by
>> copying an OpenMP directive from other parts of Eigen, and I also tried
>> to parallelize the outer loop directly with std::execution::par; see
>> below.
>>
>> In summary, I do not see any effect from the parallelization.
>> Before digging deeper into it and building a minimal working example:
>> has anyone already achieved this?
>>
>> Could one directly call the corresponding MKL routine, or are the
>> internal storage schemes different?
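>>
>> (As far as I can tell, Eigen's compressed ColMajor storage is plain
>> zero-based CSC, so a zero-copy mapping into MKL's inspector-executor
>> API might look like the sketch below -- untested, and the descriptor
>> settings are my assumption:)
>>
>>   #include <Eigen/Sparse>
>>   #include <Eigen/Dense>
>>   #include <mkl_spblas.h>
>>
>>   // C = A * B, A symmetric with only the lower triangle stored.
>>   // Assumes A is compressed (makeCompressed()) and MKL_INT matches
>>   // Eigen's StorageIndex (int in an LP64 build).
>>   void mkl_symm_spmm(Eigen::SparseMatrix<double, Eigen::ColMajor>& A,
>>                      const Eigen::MatrixXd& B, Eigen::MatrixXd& C)
>>   {
>>     sparse_matrix_t handle;
>>     mkl_sparse_d_create_csc(&handle, SPARSE_INDEX_BASE_ZERO,
>>                             (MKL_INT)A.rows(), (MKL_INT)A.cols(),
>>                             A.outerIndexPtr(), A.outerIndexPtr() + 1,
>>                             A.innerIndexPtr(), A.valuePtr());
>>
>>     matrix_descr descr;
>>     descr.type = SPARSE_MATRIX_TYPE_SYMMETRIC;
>>     descr.mode = SPARSE_FILL_MODE_LOWER;
>>     descr.diag = SPARSE_DIAG_NON_UNIT;
>>
>>     // C = 1.0 * A * B + 0.0 * C, everything column-major.
>>     mkl_sparse_d_mm(SPARSE_OPERATION_NON_TRANSPOSE, 1.0, handle, descr,
>>                     SPARSE_LAYOUT_COLUMN_MAJOR, B.data(),
>>                     (MKL_INT)B.cols(), (MKL_INT)B.rows(),
>>                     0.0, C.data(), (MKL_INT)C.rows());
>>     mkl_sparse_destroy(handle);
>>   }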
>>
>> Best regards
>> Peter
>>
>> P.S.: This is what I tried, as a diff against the current master branch:
>>
>> :~/Eigen/eigen/Eigen/src/SparseCore$ git diff SparseSelfAdjointView.h
>> diff --git a/Eigen/src/SparseCore/SparseSelfAdjointView.h b/Eigen/src/SparseCore/SparseSelfAdjointView.h
>> index 0302ef3a4..91e7d495b 100644
>> --- a/Eigen/src/SparseCore/SparseSelfAdjointView.h
>> +++ b/Eigen/src/SparseCore/SparseSelfAdjointView.h
>> @@ -10,6 +10,11 @@
>>  #ifndef EIGEN_SPARSE_SELFADJOINTVIEW_H
>>  #define EIGEN_SPARSE_SELFADJOINTVIEW_H
>>
>> +#include <iostream> /// only for debugging
>> +#include <boost/range/irange.hpp>
>> +
>> +#include <execution>
>> +
>>  #include "./InternalHeaderCheck.h"
>>
>>  namespace Eigen {
>> @@ -295,8 +300,19 @@ inline void sparse_selfadjoint_time_dense_product(const SparseLhsType& lhs, cons
>>    SparseLhsTypeNested lhs_nested(lhs);
>>    LhsEval lhsEval(lhs_nested);
>>
>> +  Eigen::initParallel();
>> +  Index threads = Eigen::nbThreads();
>> +
>> +  std::cout << "\ndot threads: " << threads << " rhs-cols: " << rhs.cols() << std::endl;
>> +
>> +  // #pragma omp parallel for
>> +  // #pragma omp parallel for schedule(dynamic,(rhs.cols()+threads*4-1)/(threads*4)) num_threads(threads)
>> +  // for (Index k=0; k<rhs.cols(); ++k)
>> +
>> +  auto r = boost::irange(rhs.cols());
>> +
>> +  std::for_each(std::execution::par, r.begin(), r.end(), [&](Index k)
>>    // work on one column at once
>> -  for (Index k=0; k<rhs.cols(); ++k)
>>    {
>>      for (Index j=0; j<lhs.outerSize(); ++j)
>>      {
>> @@ -330,6 +346,7 @@ inline void sparse_selfadjoint_time_dense_product(const SparseLhsType& lhs, cons
>>          res.coeffRef(j,k) += alpha * i.value() * rhs.coeff(j,k);
>>      }
>>    }
>> +  );
>>  }
>>