I like to pass in "-cl-nv-verbose" for compilation on NVIDIA cards. I also pass in parameters, for example "-D Nx=128 -D Ny=128". I'll look at the ViennaCL API.
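
To make the intent concrete, here is a minimal sketch of how such an option string could be assembled on the host side. The helper name make_build_options is hypothetical and not part of PETSc or ViennaCL. How the resulting string is handed to the OpenCL compiler depends on the ViennaCL API and is deliberately not shown.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of PETSc or ViennaCL): assembles an OpenCL
// build-option string that defines the grid sizes Nx and Ny as macros.
std::string make_build_options(int nx, int ny, bool nvidia_verbose)
{
  std::ostringstream opts;
  if (nvidia_verbose)
    opts << "-cl-nv-verbose ";  // NVIDIA-specific flag for verbose build logs
  opts << "-D Nx=" << nx << " -D Ny=" << ny;
  return opts.str();
}
```

Calling make_build_options(128, 128, true) yields the two sets of options quoted above joined into one string, so the kernel source can use Nx and Ny as compile-time constants.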

Thanks,
Mani

On Tue, Jan 21, 2014 at 3:28 AM, Karl Rupp <[email protected]> wrote:

> Hi Mani,
>
>> I have a few questions regarding the usage of Viennacl in Petsc.
>>
>> 1) In the residual evaluation function:
>>
>> PetscErrorCode ComputeResidual(TS ts,
>>                                PetscScalar t,
>>                                Vec X, Vec dX_dt,
>>                                Vec F, void *ptr)
>> {
>>   DM da;
>>   Vec localX;
>>   TSGetDM(ts, &da)
>>   DMGetLocalVector(da, &localX);
>>
>>   DMGlobalToLocalBegin(da, X, INSERT_VALUES, localX);
>>   DMGlobalToLocalEnd(da, X, INSERT_VALUES, localX);
>>
>>   viennacl::vector<PetscScalar> *x, *f;
>>   VecViennaCLGetArrayWrite(localX, &x);
>>   VecViennaCLGetArrayRead(F, &f);
>>
>>   viennacl::ocl::enqueue(myKernel(*x, *f));
>>   //Should it be viennacl::ocl::enqueue(myKernel(x, f))?
>>
>
> It should be viennacl::ocl::enqueue(myKernel(*x, *f));
> Usually you also want to pass the sizes to the kernel. Don't forget to
> cast the sizes to the correct types (e.g. cl_uint).
>
>>   VecViennaCLRestoreArrayWrite(localX, &x);
>>   VecViennaCLRestoreArrayRead(F, &f);
>>   DMRestoreLocalVector(da, &localX);
>> }
>>
>> Will the residual evaluation occur on the GPU/accelerator depending on
>> where we choose the ViennaCL array computations to occur? As I
>> understand, if we simply use VecGetArray in the residual evaluation
>> function, then the residual evaluation is still done on the CPU even
>> though the solves are done on the GPU.
>>
>
> If you use VecViennaCLGetArrayWrite(), the data will be valid on the GPU,
> so your residual evaluation should happen in the OpenCL kernel you provide.
> This is already the case in the code snippet above.
>
>> 2) How does one choose on which device the ViennaCL array computations
>> will occur? I was looking for some flags like -viennacl
>> cpu/gpu/accelerator but could not find any in -help.
>>
>
> Use one out of
>   -viennacl_device_cpu
>   -viennacl_device_gpu
>   -viennacl_device_accelerator
>
>> 3) How can one pass compiler flags when building OpenCL kernels in
>> ViennaCL?
>>
>
> You could do that through the ViennaCL API directly, but I'm not sure
> whether you really want to do this. Which flags do you want to set? My
> experience is that these options have little to no effect on performance,
> particularly for the memory-bandwidth-limited case. This is also the reason
> why I haven't provided a PETSc routine for this.
>
> Best regards,
> Karli
