Hi Karli, Your list sounds great to me. Glad that you and Paul are working on this together.
My main interests are in better preconditioner support and better multi-GPU/MPI scalability. Is there any progress on Steve Dalton's work on the CUSP algebraic multigrid preconditioner with PETSc? I believe Jed said in a previous email that Steve was going to be working on adding MPI support for that as well as other enhancements.

Will there be any improvements for GPU preconditioners in ViennaCL 1.5.0? When do you expect ViennaCL 1.5.0 to be available in PETSc?

I'm also interested in trying the PETSc ViennaCL support on the Xeon Phi. Do you have a schedule for when that might be ready for friendly testing?

Thanks,

Dave

--
Dave Nystrom
LANL HPC-5
Phone: 505-667-7913
Email: [email protected]
Smail: Mail Stop B272
       Group HPC-5
       Los Alamos National Laboratory
       Los Alamos, NM 87545

________________________________________
From: [email protected] [[email protected]] on behalf of Karl Rupp [[email protected]]
Sent: Friday, July 19, 2013 1:12 PM
To: For users of the development version of PETSc
Subject: [petsc-dev] Improving and stabilizing GPU support

Hi guys,

now that Paul's pull request for largely removing the txpetscgpu dependency is merged to next, I will proceed with further improving our GPU support. My ideas and TODO-list are as follows:

* Reduce CUSP dependency: The current elementary operations are mainly realized via CUSP. With better support via CUSPARSE and CUBLAS, I'd add a separate 'native' CUDA backend so that we can provide a full set of vector and sparse matrix operations out of the default NVIDIA toolchain. We will still keep CUSP for its preconditioners, yet we no longer depend on it.

* Integrate last bits of txpetscgpu package. I assume Paul will provide a helping hand here.

* Better ViennaCL bindings: The OpenCL version of VecMDot() will experience a boost with the ViennaCL 1.5.0 release; the CUDA version was fixed a couple of months back. Also, VecCopySome() will get improved in order to provide better MPI performance (similar to what Paul applied for CUSPARSE).

* Documentation: Add a chapter on GPUs to the manual, particularly on what to expect and what not to expect. Update documentation on webpage regarding installation.

* Integration of FEM quadrature from SNES ex52. The CUDA part requiring code generation is not very elegant, while the OpenCL approach is better suited for a library integration thanks to JIT. However, this requires user code to be provided as a string (again not very elegant) or loaded from file (more reasonable). How much FEM functionality do we want to provide via PETSc?

Please don't hesitate to post other GPU wishes. Now is the best time for doing so :-)

Best regards,
Karli
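
For readers unfamiliar with the VecMDot() item in the list above: VecMDot() computes the dot products of one vector x with several vectors y[0..nv-1] in a single call. Below is a minimal plain-C sketch of why a fused kernel can help, assuming real scalars and dense arrays. The name fused_mdot and its signature are hypothetical; this is not the PETSc, CUSP, or ViennaCL implementation, only an illustration that each x[i] can be loaded once and reused for all nv products.

```c
#include <stddef.h>

/* Hypothetical helper (not a PETSc or ViennaCL function).
 * Computes val[j] = sum_i x[i] * y[j][i] for j = 0..nv-1.
 * Assumes real scalars; complex conjugation is not handled. */
void fused_mdot(size_t n, const double *x, size_t nv,
                const double *const *y, double *val)
{
    /* Start every accumulator at zero. */
    for (size_t j = 0; j < nv; ++j) {
        val[j] = 0.0;
    }

    /* Outer loop over entries: x[i] is read once and reused
     * for all nv dot products, instead of nv separate passes over x. */
    for (size_t i = 0; i < n; ++i) {
        const double xi = x[i];
        for (size_t j = 0; j < nv; ++j) {
            val[j] += xi * y[j][i];
        }
    }
}
```

Calling fused_mdot(n, x, nv, y, val) fills val with the nv dot products. On a GPU the same idea would presumably be expressed as one kernel launch with a reduction per output, rather than nv separate dot-product launches.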
