> On Dec 27, 2017, at 2:53 PM, Kokron, Daniel S. (ARC-606.2)[CSRA, LLC] 
> <[email protected]> wrote:
> 
> I am looking into ways to improve performance of the VecMDot_Seq routine.  I 
> am focusing on the variant that gets called when PETSC_THREADCOMM_ACTIVE and 
> PETSC_USE_FORTRAN_KERNEL_MDOT are NOT defined.
> 
> My current version of PETSc is 3.4.5 due solely to user requirement.  I am 
> linking against MKL.
> 
> I tried and failed to implement VecMDot_Seq as a call to cblas_dgemv in 
> ~/mpi/pvec2.c
> 
> cblas_dgemv(CblasRowMajor, CblasNoTrans, nv, n, 1., b, n, xbase, 1, 0., work, 
> 1);
> 
> I could not figure out a way to extract the vectors from 'Vec y[]' and store 
> them as rows of an allocated array.

  Use VecGetArray on each one then do a memcpy of those values into the big 
allocated array (or a loop copy or whatever).
> 
> This user post starts off with a similar request (how to construct a matrix 
> from many vectors)
> https://lists.mcs.anl.gov/pipermail/petsc-users/2015-August/026848.html
> 
> I understand that this sort of memory shuffling is expensive.  I was just 
> hoping to prove the point to myself that it's possible.

   Yes the memory shuffling is possible and not that difficult, perhaps you 
just have the rows and columns backwards?

> 
> The action performed by VecMDot_Seq is the same as matrix-vector 
> multiplication, so I was wondering why it wasn't implemented as a call ?gemv?

   The reason is that we want each of the vectors to be independent of the 
other ones; to use gemv they have be shared in a single common array.

   I doubt that copying into a single array and calling the gemv will ever be 
faster. Better to just try to optimize/vectorize better the current code.


> 
> Daniel Kokron
> NASA Ames (ARC-TN)
> SciCon group

Reply via email to