Steve On Monday, October 29, 2018 at 10:43:48 AM UTC-4, Stephen DeWitt wrote: > > So far I've read through the "CUDA Support" issue > <https://github.com/dealii/dealii/issues/7037>, the "Roadmap for > inclusion of GPU implementation of matrix free in Deal.II" issue > <https://github.com/dealii/dealii/issues/2351>, the Doxygen documentation > for classes in the CUDAWrappers namespace > <https://www.dealii.org/9.0.0/doxygen/deal.II/group__CUDAWrappers.html>, > and the manuscript by Karl Ljungkvist > <http://scs.org/wp-content/uploads/2017/06/4_Final_Manuscript-1.pdf>. Are > there any other pages I should be looking at? > No that's pretty much it.
My understanding from these pages is that deal.II has partial support for > using CUDA with matrix free calculations. Currently, calculations can be > done with scalar variables (but not vector variables) and adaptive meshes. > That's correct with the weird exception that you cannot impose constraints (and thus have hanging nodes) if dim equals 2 and fe_degree is even. There is a bug in that case but we don't know why. A few (somewhat inter-related) questions: > 1). Do all of the tools exist to create a GPU version of step-48? Has > anyone done so? > I think so but nobody as tried so I am not 100% sure. Note that we don't have MPI support yet. > 2). What exactly would be involved in creating a GPU version of step-48? > Is it just changing the CPU Vector, MatrixFree, and FEEvaluation classes to > their GPU counterparts, plus packaging some data (plus local apply > functions?) into a CUDAWrappers::MatrixFree< dim, Number >::Data > <https://www.dealii.org/9.0.0/doxygen/deal.II/structCUDAWrappers_1_1MatrixFree_1_1Data.html> > struct? > Yes, pretty much. You can take a look at the matrix_free_matrix_vector tests here <https://github.com/dealii/dealii/tree/master/tests/cuda>. These tests are doing a matrix-vector multiplication using an Helmholtz operator. As you can see here <https://github.com/dealii/dealii/blob/master/tests/cuda/matrix_vector_mf.h> the main difference between the GPU and the CPU code is that the body loop over the quadrature points needs to be encapsulated in a functor when using GPU. The tests are based on the CPU tests matrix_vector here <https://github.com/dealii/dealii/tree/master/tests/matrix_free>. So you should have a good idea of the modifications required to use the GPU code. > 3). Most of the discussions seemed to revolve around linear solves. For > something like step-48 with explicit updates, will the current paradigm > work well? Or would that require shuttling data between the GPU and CPU > every time step, causing too much overhead? (I know that in general GPUs > can work very well for explicit codes.) > That should work fine. Most people are interested in using matrix-free with a solver but I don't see a reason why it would be slow in step 48. The communication between the CPU and the GPU should be limited in step-48. Best, Bruno -- The deal.II project is located at http://www.dealii.org/ For mailing list/forum options, see https://groups.google.com/d/forum/dealii?hl=en --- You received this message because you are subscribed to the Google Groups "deal.II User Group" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.
