>
>
> 3) Is comparison between pointers appropriate? For example if (dptr !=
> zarray) { is scary if some arrays are zero length how do we know what the
> pointer value will be?
>
>
Yes, you need to consider these cases, which is kind of error prone.
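
One way to sidestep the comparison, as a minimal sketch in plain C rather than PETSc: record explicitly whether the work buffer was chosen, instead of inferring it later from pointer values. With zero-length arrays both pointers may well be NULL (malloc(0), for instance, may return NULL), and then a test like dptr != zarray cannot tell the two cases apart. The names dest_t, choose_dest and finish are hypothetical and are not taken from the PETSc source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the dptr/zarray logic; not PETSc code. */
typedef struct {
  double *dst;       /* where the kernel writes its result */
  bool    used_work; /* true if dst is the work buffer rather than z */
} dest_t;

/* Pick the destination and remember the choice in a flag. */
static dest_t choose_dest(double *z, double *work, bool compressed)
{
  dest_t d;
  d.used_work = compressed;
  d.dst       = compressed ? work : z;
  return d;
}

/* Copy the work buffer back into z only when the flag says it was used.
   The decision never depends on comparing d.dst with z. */
static void finish(dest_t d, double *z, size_t n)
{
  assert(n == 0 || (d.dst != NULL && z != NULL));
  if (!d.used_work) return;
  for (size_t i = 0; i < n; i++) z[i] = d.dst[i];
}
```

The flag costs one extra field, but the branch taken no longer depends on what the allocator happens to return for empty arrays.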
Also, I think merging transpose, and not, is a go
Fande,
I ran your code with two processes and found the poor performance of
PetscSortIntWithArrayPair() was due to duplicates. In particular, rank 0 has
array length = 0 and rank 1 has array length = 4,180,070. On rank 1, each
unique array value has ~95 duplicates; the duplicates are already
My concern is
1) is it actually optimally efficient for all cases? This kind of stuff, IMHO
if (yy) {
  if (dptr != zarray) {
    ierr = VecCopy_SeqCUDA(yy,zz);CHKERRQ(ierr);
  } else if (zz != yy) {
    ierr = VecAXPY_SeqCUDA(zz,1.0,yy);CHKERRQ(ierr);
  }
} else i
Yea, I agree. Once this is working, I'll go back and split MatMultAdd, etc.
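
As a rough sketch of what such a split might look like (plain C on a small CSR matrix, not the CUSPARSE code; csr_t, csr_mult and csr_mult_add are hypothetical names), each routine does one thing and has no yy == NULL branch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CSR matrix type for illustration only. */
typedef struct {
  size_t        nrows;
  const size_t *rowptr; /* nrows + 1 entries */
  const size_t *col;    /* column index of each stored value */
  const double *val;    /* stored values */
} csr_t;

/* z = A*x */
static void csr_mult(const csr_t *A, const double *x, double *z)
{
  assert(A && x && z);
  for (size_t i = 0; i < A->nrows; i++) {
    double sum = 0.0;
    for (size_t k = A->rowptr[i]; k < A->rowptr[i + 1]; k++) {
      sum += A->val[k] * x[A->col[k]];
    }
    z[i] = sum;
  }
}

/* z = y + A*x; y and z may be the same array, because y[i] is read
   before z[i] is written. */
static void csr_mult_add(const csr_t *A, const double *x,
                         const double *y, double *z)
{
  assert(A && x && y && z);
  for (size_t i = 0; i < A->nrows; i++) {
    double sum = y[i];
    for (size_t k = A->rowptr[i]; k < A->rowptr[i + 1]; k++) {
      sum += A->val[k] * x[A->col[k]];
    }
    z[i] = sum;
  }
}
```

The inner loop is duplicated, which is the price of the split, but neither routine has to reason about which of its arguments might be missing.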
On Wed, Jul 10, 2019 at 11:16 AM Smith, Barry F. wrote:
>
> In the long run I would like to see smaller specialized chunks of code
> (with a bit of duplication between them) instead of highly overloaded
> routines lik
In the long run I would like to see smaller specialized chunks of code (with
a bit of duplication between them) instead of highly overloaded routines like
MatMultAdd_AIJCUSPARSE. Better 3 routines, for multiply alone, for multiply add
alone and for multiply add with sparse format. Trying to
Thanks, you made several changes here, including switches with the
workvector size. I guess I should import this logic to the transpose
method(s), except for the yy==NULL branches ...
MatMult_ calls MatMultAdd with yy=0, but the transpose versions have their
own code. MatMultTranspose_SeqAIJCUSPARS
On Wed, Jul 10, 2019 at 1:13 AM Smith, Barry F. wrote:
>
> ierr = VecGetLocalSize(xx,&nt);CHKERRQ(ierr);
> if (nt != A->rmap->n) SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_ARG_SIZ,"Incompatible partition of A (%D) and xx (%D)",A->rmap->n,nt);
> ierr = VecScatterInitializeForGPU(a->Mvctx,xx);CHK