Andreas Grassl wrote:
> Barry Smith wrote:
>> Hmm, it sounds like the difference between local "ghosted" vectors
>> and the global parallel vectors. But I do not understand why any of the
>> local vector entries would be zero.
>> Doesn't the vector X that is passed into KSP (or SNES) have the global
>> entries and uniquely define the solution? Why is viewing that not right?
>>
>
> I still don't understand fully the underlying processes of the whole PCNN
> solution procedure, but trying around I substituted
>
> MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE,
> gridmapping, &A);
This creates a matrix that is bigger than you want, and gives you the dead
values at the end (global dofs that are not in the range of the
LocalToGlobalMapping). This is from the note on MatCreateIS:

| m and n are NOT related to the size of the map, they are the size of the
| part of the vector owned by that process. m + nghosts (or n + nghosts) is
| the length of map since map maps all local points plus the ghost points
| to global indices.

> by
>
> MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, gridmapping,
> &A);

This creates a matrix of the correct size, but it looks like it could easily
end up with the "wrong" dofs owned locally. What you probably want to do is:

1. Resolve ownership just like with any other DD method. This partitions
   your dofs into n owned dofs and ngh ghosted dofs on each process. The
   global sum of n is N, the size of the global vectors that the solver
   will interact with.

2. Make an ISLocalToGlobalMapping where all the owned dofs come first,
   mapping (0..n-1) to (rstart..rstart+n-1), followed by the ghosted dofs
   (local index n..ngh-1), which map to remote processes. (rstart is the
   global index of the first owned dof.) One way to do this is to use
   MPI_Scan to find rstart, then number all the owned dofs and scatter the
   result. The details will depend on how you store your mesh. (I'm
   assuming it's unstructured; this step is trivial if you use a DA.)

3. Call MatCreateIS(comm,n,n,PETSC_DECIDE,PETSC_DECIDE,mapping,&A);
   (a sketch of these three steps follows at the end of this message)

> Furthermore it seems, that the load balance is now better, although I still
> don't reach the expected values, e.g.
>
> ilu-cg 320 iterations, condition 4601
> cg only 1662 iterations, condition 84919
>
> nn-cg on 2 nodes 229 iterations, condition 6285
> nn-cg on 4 nodes 331 iterations, condition 13312
>
> or is it not to expect, that nn-cg is faster than ilu-cg?

It depends a lot on the problem. As you probably know, for a second-order
elliptic problem with exact subdomain solves, the NN-preconditioned operator
(without a coarse component) has a condition number that scales as
(1/H^2)(1 + log(H/h))^2, where H is the subdomain diameter and h is the
element size. In contrast, overlapping additive Schwarz is 1/H^2 and block
Jacobi is 1/(Hh) (the original problem was 1/h^2). In particular, there is
no reason to expect that NN is uniformly better than ASM, although it may
be for certain problems.

When a coarse solve is used, NN becomes (1 + log(H/h))^2, which is
quasi-optimal (these methods are known as BDDC, which is essentially
equivalent to FETI-DP). The key advantage over multigrid (or multilevel
Schwarz) is improved robustness with variable coefficients.

My understanding is that PCNN is BDDC and uses direct subdomain solves by
default, but I could have missed something. In particular, if the coarse
solve is missing or inexact solves are used, you could easily see relatively
poor scaling. AFAIK, it's not for vector problems at this time.

Good luck.

Jed
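
A minimal sketch of steps 1-3 above, in PETSc C. It assumes each process
already knows its number of owned dofs n, its number of ghost dofs nghost,
and the global indices of those ghosts (ghost_global), obtained from the
owning processes however the mesh is stored; the function name and these
parameters are placeholders, and the signatures follow the older PETSc API
used in this thread (newer releases add block-size, copy-mode, and column
mapping arguments to ISLocalToGlobalMappingCreate and MatCreateIS):

#include <petscmat.h>

/* Sketch only: n owned dofs and nghost ghost dofs per process;
   ghost_global[] holds the global indices of the ghosts (hypothetical inputs). */
PetscErrorCode BuildISMatrix(MPI_Comm comm, PetscInt n, PetscInt nghost,
                             const PetscInt ghost_global[], Mat *A)
{
  PetscInt               rstart, i, *ltog;
  ISLocalToGlobalMapping mapping;
  PetscErrorCode         ierr;

  PetscFunctionBegin;
  /* Step 1/2: global index of the first owned dof via an inclusive scan */
  ierr = MPI_Scan(&n, &rstart, 1, MPIU_INT, MPI_SUM, comm);CHKERRQ(ierr);
  rstart -= n;

  /* Step 2: owned dofs first (local 0..n-1 -> rstart..rstart+n-1),
     then the ghosts, which map to dofs owned by other processes */
  ierr = PetscMalloc((n + nghost)*sizeof(PetscInt), &ltog);CHKERRQ(ierr);
  for (i = 0; i < n; i++)      ltog[i]   = rstart + i;
  for (i = 0; i < nghost; i++) ltog[n+i] = ghost_global[i];
  ierr = ISLocalToGlobalMappingCreate(comm, n+nghost, ltog, &mapping);CHKERRQ(ierr);

  /* Step 3: local row/column size n, global sizes computed by PETSc */
  ierr = MatCreateIS(comm, n, n, PETSC_DECIDE, PETSC_DECIDE, mapping, A);CHKERRQ(ierr);

  ierr = ISLocalToGlobalMappingDestroy(mapping);CHKERRQ(ierr);
  ierr = PetscFree(ltog);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}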
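
For reference, the condition-number scalings quoted above, written out with
H the subdomain diameter and h the element size:

\[
\kappa_{\text{NN, no coarse}} \lesssim \frac{1}{H^2}\Bigl(1 + \log\frac{H}{h}\Bigr)^2, \qquad
\kappa_{\text{ASM}} \lesssim \frac{1}{H^2}, \qquad
\kappa_{\text{block Jacobi}} \lesssim \frac{1}{Hh},
\]
\[
\kappa_{\text{unpreconditioned}} \lesssim \frac{1}{h^2}, \qquad
\kappa_{\text{NN + coarse (BDDC)}} \lesssim \Bigl(1 + \log\frac{H}{h}\Bigr)^2 .
\]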
