On 6/7/11 1:43 PM, Jed Brown wrote:
> On Tue, Jun 7, 2011 at 19:05, Boyce Griffith <griffith at cims.nyu.edu> wrote:
>
>     Another might be easier access to CUDA.
>
> What do SAMRAI matrices look like? In particular, how are they constructed?

There is not a notion of matrices in SAMRAI --- only vectors.  (I believe
that SAMRAI was designed for, and at least at LLNL is still mainly used
for, hyperbolic problems, so they never have to solve large systems of
equations.)  I'm not familiar with the details of Bobby's PFLOTRAN SAMRAI
implementation, but in my code everything is matrix free, including
preconditioners.

At one point, I tried implementing the local part of the matrix-free
operators using PETSc Mats, but performance was modestly lower than that
of a pure Fortran implementation, and so I abandoned this.  This was for
a cell-centered Poisson problem using a 5/7-point (2D/3D) finite-volume
stencil.  I think this held even in the case where there was only one
patch per processor --- i.e., contiguous local storage --- although I may
be misremembering.  I also used to build local patch-based Mats to use
with MatRelax within SAMRAI-based FAC preconditioners, but found that it
was modestly faster to use a specialized Fortran implementation of
Gauss-Seidel.  One place where using PETSc Mats to provide the local part
of the matrix does seem to win is Vanka-like smoothers for incompressible
Stokes: there, performing the local solves with PETSc seems to be faster
than whatever I cobbled together myself.

> Your mention of CUDA offers a different approach.  Instead of using the
> non-contiguous storage directly, you could copy it to contiguous
> storage.  Thus you can make a Vec type (or adorn an existing Vec) with
> SAMRAI information.  Then you would have an interface with two functions
>
>   VecSAMRAIGetVector(Vec,SAMRAIVector*);
>   VecSAMRAIRestoreVector(Vec,SAMRAIVector*);

Getting this right can be a lot of work, because parallel AMR data
indexing is not so simple for data centerings other than cell-centered,
but I have started to do something like this in some places where I need
to be able to explicitly compute matrices that look like P^t A P.

An alternative, which is what I had in mind when I said that I was
interested in trying to use PETSc to provide the storage for the SAMRAI
data structures, is to use SAMRAI for the basic grid management and PETSc
for the data storage.  One would allocate appropriately sized PETSc
vectors and set the various data pointers so that the SAMRAI data would
continue to "look like" regular SAMRAI data.  Because the data would
actually be stored in a PETSc Vec, it would also "look like" regular
PETSc data from the standpoint of a PETSc solver.  (There is a rough
sketch of what I mean in the P.S. below.)  Doing this would mean
including some redundant ghost cell data in the PETSc Vec that provides
the actual storage.  There are cases where retaining such redundant data
in the Vec might be a bad thing, and for those cases it ought to be
relatively straightforward to map the "ghosted" Vec to a "nonghosted" Vec
that is actually used in the solver.

> The first of these would actually create SAMRAI storage if it didn't
> exist and copy into it from the Vec.  If the SAMRAIVector is modified,
> a flag is set in the Vec indicating that the copy "on the SAMRAI side"
> is up to date.  If the Vec is accessed on the PETSc side via
> VecGetArray() or similar, the data would be copied over.  If it's
> modified on the PETSc side, the SAMRAI copy would be marked as invalid.
>
> Reasons for this instead of using a single copy in discontiguous storage:
>
> 1. All PETSc solver components would work.
>
> 2. Some operations will be faster using contiguous storage.
>
> 3. You can easily have a run-time option to switch.

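For concreteness, here is a rough sketch of how I imagine the bookkeeping
behind such a pairing could look.  Only the PETSc types and macros are
real; the SAMRAIVector handle and the create/copy/lookup helpers are
hypothetical placeholders.

/* Sketch only: one possible way to track which copy is current behind
   VecSAMRAIGetVector()/VecSAMRAIRestoreVector().  SAMRAIVector is an
   opaque placeholder here, and the GetLink/Create/Copy helpers are
   hypothetical. */
#include <petscvec.h>

typedef struct _n_SAMRAIVector *SAMRAIVector;  /* placeholder handle */

typedef struct {
  SAMRAIVector svec;            /* discontiguous SAMRAI-side copy, created on demand */
  PetscBool    samrai_current;  /* PETSC_TRUE if the SAMRAI copy is up to date       */
  PetscBool    petsc_current;   /* PETSC_TRUE if the contiguous copy is up to date   */
} VecSAMRAILink;

extern PetscErrorCode VecSAMRAIGetLink(Vec,VecSAMRAILink**);         /* hypothetical: stashed on the Vec */
extern PetscErrorCode SAMRAIVectorCreateFromVec(Vec,SAMRAIVector*);  /* hypothetical */
extern PetscErrorCode CopyVecToSAMRAI(Vec,SAMRAIVector);             /* hypothetical */

PetscErrorCode VecSAMRAIGetVector(Vec v,SAMRAIVector *sv)
{
  VecSAMRAILink  *link;
  PetscErrorCode ierr;

  ierr = VecSAMRAIGetLink(v,&link);CHKERRQ(ierr);
  if (!link->svec) {            /* create the SAMRAI storage on first use */
    ierr = SAMRAIVectorCreateFromVec(v,&link->svec);CHKERRQ(ierr);
  }
  if (!link->samrai_current) {  /* refresh the SAMRAI copy from the contiguous Vec */
    ierr = CopyVecToSAMRAI(v,link->svec);CHKERRQ(ierr);
    link->samrai_current = PETSC_TRUE;
  }
  *sv = link->svec;
  return 0;
}

PetscErrorCode VecSAMRAIRestoreVector(Vec v,SAMRAIVector *sv)
{
  VecSAMRAILink  *link;
  PetscErrorCode ierr;

  ierr = VecSAMRAIGetLink(v,&link);CHKERRQ(ierr);
  /* The caller may have written through the SAMRAI view; the contiguous
     copy is refreshed lazily, e.g. by a VecGetArray() hook that checks
     petsc_current, and a write access on the PETSc side would in turn
     clear samrai_current. */
  link->petsc_current = PETSC_FALSE;
  *sv = NULL;
  return 0;
}
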
> The downside is possibly using more memory, but in most use cases, only
> a couple Vecs are actually touched by the application.  Most of the Vecs
> are in the Krylov space or stored time steps.  Those would be lighter and
> faster using contiguous storage (PETSc native), and since
> VecSAMRAIGetVector() would only be called on the ones the user interacts
> directly with, it probably only costs a couple vectors to maintain the
> multiple copies (which could mean less storage in total when summed over
> the whole Krylov space).

I am already sort-of heading in this direction, although it is pretty low
on my TODO list right now.

> What do SAMRAI matrices look like?  What is the assembly code like?  Can
> we preserve that API and get the data to go directly into a PETSc format?

No, we can't, but only because there is not really any API there to
preserve. ;-)  (Again, speaking only for my code, not Bobby's.)

-- Boyce

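P.S.  To make the "use PETSc to provide the storage" alternative above a
bit more concrete, here is a rough sketch of the kind of thing I have in
mind.  The PatchView struct and the per-patch offsets are hypothetical
stand-ins for whatever the SAMRAI patch data classes would actually
require; only the PETSc calls are real.  Mapping the "ghosted" Vec to a
"nonghosted" one for the solver would be a separate step (e.g., with a
VecScatter).

#include <petscvec.h>

/* Hypothetical view of one patch's data, stored inside a PETSc Vec. */
typedef struct {
  PetscInt     offset;  /* start of this patch's (ghosted) values in the local array */
  PetscInt     size;    /* number of values, including redundant ghost cells */
  PetscScalar *data;    /* pointer handed to SAMRAI so the patch data "looks like" its own storage */
} PatchView;

/* Point each patch at its slice of the Vec.  The array stays checked out
   until DetachPatches() is called, so SAMRAI reads and writes through
   views[p].data in the meantime. */
PetscErrorCode AttachPatches(Vec ghosted,PatchView views[],PetscInt npatches,PetscScalar **array)
{
  PetscInt       p;
  PetscErrorCode ierr;

  ierr = VecGetArray(ghosted,array);CHKERRQ(ierr);
  for (p = 0; p < npatches; ++p) views[p].data = *array + views[p].offset;
  return 0;
}

/* Drop the patch pointers and hand the storage back to the Vec before it
   is used by a PETSc solver. */
PetscErrorCode DetachPatches(Vec ghosted,PatchView views[],PetscInt npatches,PetscScalar **array)
{
  PetscInt       p;
  PetscErrorCode ierr;

  for (p = 0; p < npatches; ++p) views[p].data = NULL;
  ierr = VecRestoreArray(ghosted,array);CHKERRQ(ierr);
  return 0;
}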