Stefano recently modified the following code:
PetscErrorCode VecCreate_SeqCUDA(Vec V)
{
  PetscErrorCode ierr;
  PetscFunctionBegin;
  ierr = PetscLayoutSetUp(V->map);CHKERRQ(ierr);
  ierr = VecCUDAAllocateCheck(V);CHKERRQ(ierr);
  ierr = VecCreate_SeqCUDA_Private(V,((Vec_CUDA*)V->spptr)->GPUarray_allocated);CHKERRQ(ierr);
  ierr = VecCUDAAllocateCheckHost(V);CHKERRQ(ierr);
  ierr = VecSet(V,0.0);CHKERRQ(ierr);
  ierr = VecSet_Seq(V,0.0);CHKERRQ(ierr);
  V->valid_GPU_array = PETSC_OFFLOAD_BOTH;
  PetscFunctionReturn(0);
}
That means that if one creates a SEQCUDA vector V and then immediately tests
whether (V->valid_GPU_array == PETSC_OFFLOAD_GPU), the test fails, because the
flag has been set to PETSC_OFFLOAD_BOTH. That is counterintuitive. I think we
should have
  enum {
    PETSC_OFFLOAD_UNALLOCATED = 0x0,
    PETSC_OFFLOAD_GPU         = 0x1,
    PETSC_OFFLOAD_CPU         = 0x2,
    PETSC_OFFLOAD_BOTH        = 0x3
  }
and then test with if (V->valid_GPU_array & PETSC_OFFLOAD_GPU). What do you think?
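
To make the idea concrete, here is a minimal standalone sketch (not PETSc code). Only the enumerator names and values come from the proposal above; the type name OffloadFlagSketch and the helper functions are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PETSc flag type, using the proposed bit
   values: bit 0 = GPU copy is valid, bit 1 = CPU copy is valid. */
typedef enum {
  PETSC_OFFLOAD_UNALLOCATED = 0x0,
  PETSC_OFFLOAD_GPU         = 0x1,
  PETSC_OFFLOAD_CPU         = 0x2,
  PETSC_OFFLOAD_BOTH        = 0x3  /* PETSC_OFFLOAD_GPU | PETSC_OFFLOAD_CPU */
} OffloadFlagSketch;

/* Hypothetical helper: true when the GPU copy holds valid data. */
bool gpu_copy_is_valid(OffloadFlagSketch flag)
{
  return (flag & PETSC_OFFLOAD_GPU) != 0;
}

/* Hypothetical helper: true when the CPU copy holds valid data. */
bool cpu_copy_is_valid(OffloadFlagSketch flag)
{
  return (flag & PETSC_OFFLOAD_CPU) != 0;
}

/* Hypothetical self-check mimicking a freshly created SEQCUDA vector,
   whose flag is PETSC_OFFLOAD_BOTH after VecCreate_SeqCUDA. */
void check_fresh_vector_flag(void)
{
  OffloadFlagSketch fresh = PETSC_OFFLOAD_BOTH;

  assert(fresh != PETSC_OFFLOAD_GPU); /* the equality test fails */
  assert(gpu_copy_is_valid(fresh));   /* the bitmask test succeeds */
  assert(cpu_copy_is_valid(fresh));
}
```

With these values, a vector whose data is valid on both sides passes the bitmask test for either side, while the equality test still distinguishes "GPU only" from "both" when that distinction is needed.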
--Junchao Zhang