Yikes, forget about bit flags and names. 

  Does this behavior make sense? Does EVERY CUDA vector allocate memory on both 
the GPU and the CPU? Or do I misunderstand the code?

   This seems fundamentally wrong, and it is different from before. What about 
the dozens of work vectors on the GPU (for example, for Krylov methods)? There 
is no reason for them to have memory allocated on the CPU. In the long run, 
pretty much all the matrices and vectors will reside only on the GPU, so this 
seems like a step backwards. Does libaxb do this? 
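
   To make the concern concrete, here is a minimal sketch of the alternative: 
allocate the host copy only when it is first requested. This is not PETSc 
code. LazyVec and the lazyvec_* functions are hypothetical names, and plain 
heap memory stands in for the GPU buffer so the example compiles without CUDA.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical vector type for illustration; this is not PETSc's Vec. */
typedef struct {
  size_t  n;
  double *device; /* stand-in for a GPU buffer (plain heap memory here) */
  double *host;   /* stays NULL until the host copy is first requested */
} LazyVec;

/* Create: allocate and zero only the "device" buffer. */
int lazyvec_create(LazyVec *v, size_t n)
{
  v->n      = n;
  v->host   = NULL;
  v->device = calloc(n ? n : 1, sizeof(double));
  return v->device ? 0 : -1;
}

/* Host access: allocate and fill the host copy on first use only. */
double *lazyvec_host(LazyVec *v)
{
  if (!v->host) {
    v->host = malloc((v->n ? v->n : 1) * sizeof(double));
    if (!v->host) return NULL;
    /* stand-in for a device-to-host copy */
    memcpy(v->host, v->device, v->n * sizeof(double));
  }
  return v->host;
}

/* Destroy: release whichever buffers were actually allocated. */
void lazyvec_destroy(LazyVec *v)
{
  free(v->host);
  free(v->device);
  v->host = v->device = NULL;
}
```

   With this layout a work vector that is never touched from the host never 
pays for a host buffer.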


   Barry





> On Oct 1, 2019, at 10:24 PM, Zhang, Junchao via petsc-dev 
> <[email protected]> wrote:
> 
> Stefano recently modified the following code,
> 
> 
> PetscErrorCode VecCreate_SeqCUDA(Vec V)
> {
>   PetscErrorCode ierr;
> 
>   PetscFunctionBegin;
>   ierr = PetscLayoutSetUp(V->map);CHKERRQ(ierr);
>   ierr = VecCUDAAllocateCheck(V);CHKERRQ(ierr);
>   ierr = VecCreate_SeqCUDA_Private(V,((Vec_CUDA*)V->spptr)->GPUarray_allocated);CHKERRQ(ierr);
>   ierr = VecCUDAAllocateCheckHost(V);CHKERRQ(ierr);
>   ierr = VecSet(V,0.0);CHKERRQ(ierr);
>   ierr = VecSet_Seq(V,0.0);CHKERRQ(ierr);
>   V->valid_GPU_array = PETSC_OFFLOAD_BOTH;
>   PetscFunctionReturn(0);
> }
> 
> That means if one creates an SEQCUDA vector V and then immediately tests 
> if (V->valid_GPU_array == PETSC_OFFLOAD_GPU), the test will fail. That is 
> counterintuitive. I think we should have
> 
> enum {PETSC_OFFLOAD_UNALLOCATED=0x0,PETSC_OFFLOAD_GPU=0x1,PETSC_OFFLOAD_CPU=0x2,PETSC_OFFLOAD_BOTH=0x3}
> 
> and then use if (V->valid_GPU_array & PETSC_OFFLOAD_GPU). What do you think?
> 
> 
> 
> --Junchao Zhang
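
For reference, the bit-flag test proposed in the quoted message can be 
sketched as follows. This is only an illustration under assumed names: the 
OFFLOAD_* constants mirror the values in the proposed enum, while OffloadMask, 
gpu_copy_valid, cpu_copy_valid, and offload_mask_demo are hypothetical and not 
part of PETSc.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the values mirror the enum proposed above. */
typedef enum {
  OFFLOAD_UNALLOCATED = 0x0,
  OFFLOAD_GPU         = 0x1,
  OFFLOAD_CPU         = 0x2,
  OFFLOAD_BOTH        = 0x3 /* OFFLOAD_GPU | OFFLOAD_CPU */
} OffloadMask;

/* True when the GPU copy is valid, whether or not the CPU copy is too. */
bool gpu_copy_valid(OffloadMask m) { return (m & OFFLOAD_GPU) != 0; }

/* True when the CPU copy is valid, whether or not the GPU copy is too. */
bool cpu_copy_valid(OffloadMask m) { return (m & OFFLOAD_CPU) != 0; }

/* Contrast the equality test with the bit test on a freshly created vector. */
void offload_mask_demo(void)
{
  OffloadMask fresh = OFFLOAD_BOTH; /* state set at the end of VecCreate_SeqCUDA */

  assert(fresh != OFFLOAD_GPU);     /* the == test reports "not on GPU" */
  assert(gpu_copy_valid(fresh));    /* the & test reports "valid on GPU" */
  assert(cpu_copy_valid(fresh));
}
```

Because OFFLOAD_BOTH is the bitwise OR of the two single-location flags, the 
mask test answers "is the GPU copy valid?" without the caller having to 
enumerate every state in which that is true.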
