[petsc-users] Problems with the intel compilers
Hi Dominik,

Dominik Szczerba wrote:
> I noticed this quite a while ago: setting up PETSc with the Intel
> compilers (--with-cc=icc --with-cxx=icpc) takes an order of magnitude
> longer than with the native GNU compilers. The step taking most of the
> time is configuring mpich. I have tried to configure mpich separately
> and indeed, with GNU it is a breeze, with Intel it is 10x slower. This
> holds in both cases, 32- and 64-bit Linux, Ubuntu 9.04 and Debian
> testing, with Intel compilers 10.x and 11.x. I just wanted to ask
> whether anybody has made similar observations and/or finds using Intel
> worthwhile at all.

On Itanium machines the Intel compilers produce much faster code than the GNU compilers. On other machines I did not notice a big speed difference, but I think the Intel compilers produce slightly faster code there as well.

Cheers,

ando

--
 /\
 \ /  ASCII Ribbon
  X   against HTML email
 / \
PaStiX crash
Hi Barry,

here again with line numbers: http://pastebin.com/m630324e

I noticed that the PaStiX version built with only '-g' gives no errors. I hope this output now helps with debugging.

Cheers,

ando

Barry Smith wrote:
> I think if you compile all the code (including Scotch) with the -g
> option, as Satish suggested, then it should show the exact line numbers
> in the source code where the corruption occurs, and you could report it
> to the Scotch developers. As it is, without the line numbers, it may be
> difficult for the Scotch developers to determine the problem.
>
> Barry
>
> On Oct 21, 2009, at 10:49 AM, Andreas Grassl wrote:
>> Satish Balay wrote:
>>> Perhaps you can try running in valgrind to see where the problem is.
>>> You can also try --with-debugging=0 COPTFLAGS='-g -O' - and see if
>>> it crashes. If so - run in a debugger to determine the problem.
>>
>> here you find the output of valgrind: http://pastebin.com/m16478dcf
>>
>> [remainder of the quoted thread, including the PaStiX output and the
>> PETSc error trace, trimmed]
PaStiX crash
Satish Balay wrote:
> Perhaps you can try running in valgrind to see where the problem is.
> You can also try --with-debugging=0 COPTFLAGS='-g -O' - and see if it
> crashes. If so - run in a debugger to determine the problem.

Here you find the output of valgrind: http://pastebin.com/m16478dcf

It seems the problem is around the Scotch library. Trying to substitute the library with the working version from the debugging branch did not work, and I found no option to change the ordering algorithm to, e.g., the (Par)METIS installed for MUMPS.

Any ideas?

Cheers,

ando

> Satish
>
> On Tue, 20 Oct 2009, Andreas Grassl wrote:
>> Hello, I wanted to use PaStiX and have the problem that the debugging
>> version works, whereas PETSc compiled with the option
>> --with-debugging=0 gives the following error. What could be wrong?
>>
>> [quoted PaStiX output and PETSc error trace trimmed]
PaStiX crash
Hello,

I wanted to use PaStiX and have the problem that the debugging version works, whereas PETSc compiled with the option --with-debugging=0 gives the following error. What could be wrong?

 +--------------------------------------------------------------------+
 +              PaStiX : Parallel Sparse matriX package               +
 +--------------------------------------------------------------------+
  Matrix size                                   7166 x 7166
  Number of nonzeros                            177831
 +--------------------------------------------------------------------+
 +  Options                                                           +
 +--------------------------------------------------------------------+
        Version             : exported
        SMP_SOPALIN         : Defined
        VERSION MPI         : Defined
        PASTIX_BUBBLE       : Not defined
        STATS_SOPALIN       : Not defined
        NAPA_SOPALIN        : Defined
        TEST_IRECV          : Not defined
        TEST_ISEND          : Defined
        THREAD_COMM         : Not defined
        THREAD_FUNNELED     : Not defined
        TAG                 : Exact Thread
        FORCE_CONSO         : Not defined
        RECV_FANIN_OR_BLOCK : Not defined
        OUT_OF_CORE         : Not defined
        DISTRIBUTED         : Not defined
        FLUIDBOX            : Not defined
        METIS               : Not defined
        INTEGER TYPE        : int32_t
        FLOAT TYPE          : double
 +--------------------------------------------------------------------+
 Check : ordering        OK
 Check : Sort CSC        OK
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal
[0]PETSC ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Signal received!
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 8, Fri Aug 21 14:02:12 CDT 2009
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: ./standalonesolver on a linux32-i named login.leo1 by c702174 Tue Oct 20 11:55:24 2009
[0]PETSC ERROR: Libraries linked from /mnt/x4540/hpc-scratch/c702174/leo1/petsc/petsc-3.0.0-p8/linux32-intel-c-leo1/lib
[0]PETSC ERROR: Configure run at Tue Oct 20 00:39:27 2009
[0]PETSC ERROR: Configure options --with-scalar-type=real --with-debugging=0 --with-precision=double --with-shared=0 --with-mpi=1 --with-mpi-dir=/usr/site/hpc/x86_64/glibc-2.5/italy/openmpi/1.3.3/intel-11.0 --with-scalapack=1 --download-scalapack=ifneeded --download-f-blas-lapack=ifneeded --with-blacs=1 --download-blacs=ifneeded --with-parmetis=1 --download-parmetis=ifneeded --with-mumps=1 --download-mumps=ifneeded --with-spooles=1 --download-spooles=ifneeded --with-superlu_dist=1 --download-superlu_dist=ifneeded --with-scotch=1 --download-scotch=ifneeded --with-pastix=1 --download-pastix=ifneeded --with-umfpack=1 --download-umfpack=ifneeded PETSC_ARCH=linux32-intel-c-leo1
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file

Cheers,

ando

--
 /\                               Grassl Andreas
 \ /    ASCII Ribbon Campaign     Uni Innsbruck Institut f. Mathematik
  X     against HTML email        Technikerstr. 13 Zi 709
 / \                              +43 (0)512 507 6091
SBAIJ issue
Hong Zhang wrote:
> Ando,
>
> I do not see any error message from attached info below. Even
> '-log_summary' gives correct display. I guess you sent us the working
> output (np=2).

I have attached 3 files. The one in which you found the -log_summary output is indeed the working scenario. The other 2 are hanging.

Output of top for np=4 while still running:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 8466 csae1801  25   0 1442m 704m 5708 R  100  5.9   1:20.87 externalsolver
 8468 csae1801  25   0 1413m 697m 5052 R  100  5.8   1:13.45 externalsolver
 8469 csae1801  25   0 1359m 614m 5148 R  100  5.1   1:12.75 externalsolver
 8467 csae1801  25   0 1415m 702m 5096 R   96  5.9   1:13.01 externalsolver

Output of top for np=4 when hanging:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 8466 csae1801  18   0 1443m 769m 6120 S    0  6.4   2:09.47 externalsolver
 8468 csae1801  15   0 1413m 759m 5420 S    0  6.3   2:00.87 externalsolver
 8467 csae1801  15   0 1415m 748m 5396 S    0  6.2   2:01.21 externalsolver
 8469 csae1801  18   0 1359m 688m 5460 S    0  5.7   2:01.39 externalsolver

Other processes use about 12% of the memory in total.

> I would suggest you run your code with debugger, e.g.,
> '-start_in_debugger'. When it hangs, type Control-C, and type 'where'
> to check where it hangs.

I guess it is hanging somewhere after the numerical factorization, because the extrapolated time would match.
Using the debug or the non-debug version doesn't change the behaviour.

Output from 'where' (using gdb):

#0  0x003a0ccc5cdf in poll () from /lib64/libc.so.6
#1  0x011d1024 in MPIDU_Sock_wait (sock_set=0x4464890, millisecond_timeout=4, eventp=0x) at sock_wait.i:124
#2  0x011a3203 in MPIDI_CH3I_Progress (blocking=71714960, state=0x4) at ch3_progress.c:1038
#3  0x011843ce in PMPI_Recv (buf=0x4464890, count=4, datatype=-1, source=-1, tag=108517088, comm=168072704, status=0x4f503b0) at recv.c:156
#4  0x00ea9926 in BI_Srecv (ctxt=0x4f522d0, src=-2, msgid=2, bp=0x1813ad8) at BI_Srecv.c:8
#5  0x00ea9414 in BI_SringBR (ctxt=0x4f522d0, bp=0x1813ad8, send=0xea9800 BI_Ssend, src=1) at BI_SringBR.c:16
#6  0x00ea22b1 in igebr2d_ (ConTxt=0x7fff0afeb110, scope=0x12a57f8 Rowwise, top=0x17b9094 S, m=0x12a57b8, n=0x12a57b8, A=0x7fff0afeb560, lda=0x12a57b8, rsrc=0x7fff0afeb118, csrc=0x7fff0afeb090) at igebr2d_.c:198
#7  0x00e3b0f5 in pdpotf2 (uplo=Invalid C/C++ type code 13 in symbol table.
    ) at pdpotf2.f:340
#8  0x00e2c818 in pdpotrf (uplo=Invalid C/C++ type code 13 in symbol table.
    ) at pdpotrf.f:327
#9  0x00c5daf6 in dmumps_146 (myid=0,
    root= {mblock = 48, nblock = 48, nprow = 2, npcol = 2, myrow = 0, mycol = 0,
      root_size = 2965, tot_root_size = 2965, cntxt_blacs = 0,
      rg2l_row = 0x676f0bf, rg2l_col = 0x676f107, ipiv = 0x676f14f,
      descriptor = {1, 0, 2965, 2965, 48, 48, 0, 0, 1488},
      descb = {0, 0, 0, 0, 0, 0, 0, 0, 0},
      yes = 4294967295, gridinit_done = 4294967295, lpiv = 1,
      schur_pointer = 0x676f1eb, schur_mloc = 0, schur_nloc = 0, schur_lld = 0,
      qr_tau = 0x676f23f, qr_rcond = 0, maxg = 0, gind = 0,
      grow = 0x676f297, gcos = 0x676f2df, gsin = 0x676f327,
      elg_max = 0, null_max = 0, elind = 0, euind = 0, nlupdate = 0, nuupdate = 0,
      perm_row = 0x676f387, perm_col = 0x676f3cf, elrow = 0x676f417, eurow = 0x676f45f,
      ptrel = 0x676f4a7, ptreu = 0x676f4ef, elelg = 0x676f537, euelg = 0x676f57f,
      dl = 0x676f5c7},
    n=446912, iroot=266997, comm=-2080374780, iw=0x2aaaf5c49010, liw=8275423, ifree=1646107,
    a=0x2aaab9d6e010, la=125678965, ptrast=0xb09d2fc, ptlust_s=0xb05d200, ptrfac=0xb0727b0,
    step=0xb4aca20, info={0, 0}, ldlt=1, qr=0, wk=0x2aaacab98ba0, lwk=90267651,
    keep= {8, 2571, 96, 24, 16, 48, 150, 120, 400, 6875958, 2147483646, 200, 3015153, 3259551, 1655023, 0, 0, 0, 0, 0, 0, 0, 0, 18, 0, 1646982, 3705, 21863, 8275423, 0, 0, 0, 0, 4, 8, 1, 800, 266997, 16, -456788, 8, 0, 190998, 190998, 0, 1, 2, 5, 12663, 1, 48, 0, 0, 3, 0, 5, 500, 250, 0, 0, 0, 100, 60, 10, 120, 28139, 84754429, 0, 1, 0, 21863, 0, 0, 0, 1, 2, 30, 0, 2147483647, 1, 0, 5, 4, -8, 100, 1, 70, 70, 0, 1, 4, 0, 0, 0, 1, 0, 0, 0, 4, 1200, 8791225, 150, 0, 16, 0, 1, 0, 1370, 0, 0, 0, 0, 11315240, 12209064, 0 repeats 11 times, 6167135, 3705, 0 repeats 74 times, 2214144, 0, 0, 0, 0, 0, 0, -1, 2, 2, 2214144, 201, 2, 0, 1, 0, 50, 1, 0, 0, 5, 2291986, 1670494, 1678547, 142320, 32, 0, 0, 0, 1, 3, 0, 1, 0, 0, 0, 12, 1, 10, 0 repeats 260 times},
    keep8= {0, 407769668, 177587312, 0, 0, 0, 0, 0, 31341437, 30351541, 35301388, 41892965, 125678965, 12496233, 574564, 0, 37488833, 0 repeats 91 times, 120657071, 0, 137362626, 0 repeats 39 times}) at dmumps_part7.F:286
#10 0x00c17921 in dmumps_251 (n=446912, iw=0x2aaaf5c49010, liw=8275423, a=0x2aaab9d6e010, la=125678965, nstk_steps=0xb0dd3d0, nbprocfils=0xb0f296c, iflag=0, nd=0x4dbe8f0, fils=0xb661130, step=0xb4aca20, frere=0x4dd3ea0, dad=0x4de9450, cand=0x6a24830, istep_to_iniv2=0x4dfea00, tab_pos_in_pere=0x67bbff0, maxfrt=0, ntotpv=0, ptrist=0xb087d60, ptrast=0xb09d2fc,
SBAIJ issue
Barry Smith wrote:
> Perhaps you have not done enough memory preallocation for the
> multiprocessors and it is going very--very slowly with memory
> allocations?

Do you mean this line?

MatMPISBAIJSetPreallocation(A, bs,nz, PETSC_NULL,nz, PETSC_NULL)

I set bs to 1 and nz to 77.

Or do you mean the icntl(14) option of MUMPS? Increasing it to 2000 allows the run with np=4 to complete. On an even larger problem (~180 DOF) I now get a full run by setting the icntl(23) option to a reasonable value. So it seems to be a MUMPS fine-tuning problem!?

The problem with -ksp_view_binary persists.

Is it possible that MUMPS works much faster and balances the load better if it has a vast amount of memory available? Any general advice besides switching to a larger machine?

Cheers,

ando
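The preallocation question above can be made concrete with a small sketch. It is plain Python, not PETSc, and it rests on an assumption about the MPISBAIJ convention: only the upper triangle (including the diagonal) is stored, the "diagonal" count covers columns inside the locally owned row range, and the "off-diagonal" count covers columns to the right of it. The function name and the row-range arguments are made up for illustration. Exact per-row counts of this kind would be passed in place of the single estimate nz.

```python
def sbaij_prealloc_counts(entries, rstart, rend):
    """Count upper-triangular nonzeros per owned row (block size 1).

    entries: iterable of (row, col) global index pairs in any order; may
             contain lower-triangular and duplicate pairs.
    rstart, rend: this process is assumed to own rows rstart..rend-1.
    Returns (d_nnz, o_nnz): per-row counts for the diagonal block
    (col < rend) and for the part right of it (col >= rend).
    """
    n = rend - rstart
    d_nnz = [0] * n
    o_nnz = [0] * n
    seen = set()
    for row, col in entries:
        if not (rstart <= row < rend):
            continue          # row belongs to another process
        if col < row:
            continue          # lower triangle is not stored
        if (row, col) in seen:
            continue          # count each position once
        seen.add((row, col))
        if col < rend:
            d_nnz[row - rstart] += 1
        else:
            o_nnz[row - rstart] += 1
    return d_nnz, o_nnz


# 4x4 tridiagonal pattern split over two processes with two rows each.
pattern = [(i, j) for i in range(4) for j in range(4) if abs(i - j) <= 1]
print(sbaij_prealloc_counts(pattern, 0, 2))
print(sbaij_prealloc_counts(pattern, 2, 4))
```

If the counts are too small, every additional entry forces a reallocation during assembly, which is the slowdown Barry hints at.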
SBAIJ issue
Hong Zhang wrote:
> I would suggest you run your code with debugger, e.g.,
> '-start_in_debugger'. When it hangs, type Control-C, and type 'where'
> to check where it hangs.

Below is the debugger output of the running process when the option -ksp_view_binary is given together with mumps_cholesky. It hangs after solving. Any explanations and suggestions?

Cheers,

ando

Program received signal SIGINT, Interrupt.
[Switching to Thread 46912507935936 (LWP 11965)]
0x2b00eb8a in __intel_new_memset () from /opt/intel/Compiler/11.0/074/lib/intel64/libirc.so
(gdb) bt
#0  0x2b00eb8a in __intel_new_memset () from /opt/intel/Compiler/11.0/074/lib/intel64/libirc.so
#1  0x2afecb66 in _intel_fast_memset.J () from /opt/intel/Compiler/11.0/074/lib/intel64/libirc.so
#2  0x00aefefe in PetscMemzero (a=0x14b206c0, n=8051408) at memc.c:205
#3  0x00ab41d0 in PetscTrFreeDefault (aa=0x14b206c0, line=89, function=0x1260b10 MatSeqXAIJFreeAIJ, file=0x1260840 /home/lux/csae1801/petsc/petsc-3.0.0-p8/include/../src/mat/impls/aij/seq/aij.h, dir=0x1260ad4 src/mat/impls/sbaij/mpi/) at mtr.c:318
#4  0x008f606e in MatSeqXAIJFreeAIJ (AA=0x143112a0, a=0x143124c8, j=0x143124b8, i=0x143124b0) at /home/lux/csae1801/petsc/petsc-3.0.0-p8/include/../src/mat/impls/aij/seq/aij.h:89
#5  0x008f773e in MatSetValues_MPISBAIJ (mat=0x133aaf30, m=1, im=0x15cc2f70, n=43, in=0x15f6fa10, v=0x16cfb938, addv=NOT_SET_VALUES) at mpisbaij.c:202
#6  0x008fc0a9 in MatAssemblyEnd_MPISBAIJ (mat=0x133aaf30, mode=MAT_FINAL_ASSEMBLY) at mpisbaij.c:539
#7  0x00633e5e in MatAssemblyEnd (mat=0x133aaf30, type=MAT_FINAL_ASSEMBLY) at matrix.c:4561
#8  0x008fe302 in MatView_MPISBAIJ_ASCIIorDraworSocket (mat=0x11df00e0, viewer=0x133a4070) at mpisbaij.c:704
#9  0x008fe95c in MatView_MPISBAIJ (mat=0x11df00e0, viewer=0x133a4070) at mpisbaij.c:733
#10 0x00603570 in MatView (mat=0x11df00e0, viewer=0x133a4070) at matrix.c:643
#11 0x004c9962 in KSPSolve (ksp=0x11f3ed80, b=0x116650a0, x=0x115fe9b0) at itfunc.c:328
#12 0x0040a5ff in main (argc=1, argv=0x7fff3baade68) at externalsolver.c:590
(gdb) c
Continuing.

Program received signal SIGINT, Interrupt.
0x2b00adf3 in __intel_new_memcpy () from /opt/intel/Compiler/11.0/074/lib/intel64/libirc.so
(gdb) bt
#0  0x2b00adf3 in __intel_new_memcpy () from /opt/intel/Compiler/11.0/074/lib/intel64/libirc.so
#1  0x2afecb16 in _intel_fast_memcpy.J () from /opt/intel/Compiler/11.0/074/lib/intel64/libirc.so
#2  0x00aef6f5 in PetscMemcpy (a=0x14b2abc0, b=0x156bed20, n=3721504) at memc.c:102
#3  0x008f74b7 in MatSetValues_MPISBAIJ (mat=0x133aaf30, m=1, im=0x15cc7964, n=44, in=0x15f74404, v=0x16d04d20, addv=NOT_SET_VALUES) at mpisbaij.c:202
#4  0x008fc0a9 in MatAssemblyEnd_MPISBAIJ (mat=0x133aaf30, mode=MAT_FINAL_ASSEMBLY) at mpisbaij.c:539
#5  0x00633e5e in MatAssemblyEnd (mat=0x133aaf30, type=MAT_FINAL_ASSEMBLY) at matrix.c:4561
#6  0x008fe302 in MatView_MPISBAIJ_ASCIIorDraworSocket (mat=0x11df00e0, viewer=0x133a4070) at mpisbaij.c:704
#7  0x008fe95c in MatView_MPISBAIJ (mat=0x11df00e0, viewer=0x133a4070) at mpisbaij.c:733
#8  0x00603570 in MatView (mat=0x11df00e0, viewer=0x133a4070) at matrix.c:643
#9  0x004c9962 in KSPSolve (ksp=0x11f3ed80, b=0x116650a0, x=0x115fe9b0) at itfunc.c:328
#10 0x0040a5ff in main (argc=1, argv=0x7fff3baade68) at externalsolver.c:590
(gdb)
SBAIJ issue
Hello,

I'm trying to work with MUMPS Cholesky as direct solver and CG as iterative solver, and I have problems with SBAIJ matrices. My MatSetValues call does not take the symmetry into account; instead I force it with the command-line option -mat_ignore_lower_triangular.

The problems show up for bigger problems (~45 DOF) with more processes (1 and 2 work fine, 4 gives a hang) and/or when trying to save the matrix. The machine has 4 cores and 12 GB memory. NNZ per row is ~80.

Attached you find the output of the program run with -info.

Any hints where to search?

Cheers,

ando

Attachments:
  2procinfo: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20091012/cdf0afd0/attachment-0003.diff
  2procinfokspviewbinary: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20091012/cdf0afd0/attachment-0004.diff
  4procinfo: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20091012/cdf0afd0/attachment-0005.diff
convergence monitoring
Barry Smith wrote:
> On Oct 5, 2009, at 11:09 AM, Andreas Grassl wrote:
>> Is there an easy (without hacking the PETSc sources) way to output a
>> customized convergence monitoring line like
>> -ksp_monitor_true_residual at every step?
>
> What do you want to do? I cannot understand your question. You can use
> KSPMonitorSet() to provide any function you want to display residuals
> anyway you want.

This is the function I was searching for. Browsing through the various routines, I overlooked the short description and concentrated only on the option keys :)

Cheers,

ando
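The callback pattern Barry describes can be illustrated without PETSc. The sketch below is a plain-Python conjugate gradient loop that calls a user-supplied monitor once per iteration; it only mirrors the idea behind KSPMonitorSet() and makes no claim about PETSc's actual signature. The names cg, laplace_1d and my_monitor are made up for this example.

```python
import math


def cg(matvec, b, monitor=None, rtol=1e-8, maxit=100):
    """Plain conjugate gradients; calls monitor(it, rnorm) at every step."""
    x = [0.0] * len(b)
    r = list(b)                       # residual b - A*x for the start x = 0
    p = list(r)
    rr = sum(ri * ri for ri in r)
    bnorm = math.sqrt(rr)
    for it in range(maxit + 1):
        rnorm = math.sqrt(rr)
        if monitor is not None:
            monitor(it, rnorm)        # user hook, free to print or store
        if rnorm <= rtol * bnorm or it == maxit:
            break
        Ap = matvec(p)
        alpha = rr / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = sum(ri * ri for ri in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def laplace_1d(v):
    """Matrix-free product with the tridiagonal matrix (-1, 2, -1)."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


history = []


def my_monitor(it, rnorm):
    """Customized monitoring line; also keeps the values for later plotting."""
    history.append((it, rnorm))
    print(f"step {it:3d}  residual norm {rnorm:.3e}")


x = cg(laplace_1d, [1.0] * 10, monitor=my_monitor)
```

Collecting the values in a list, as my_monitor does, gives the iteration-versus-residual data directly, without going through a graphical viewer.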
convergence monitoring
Hello,

I want to monitor the convergence of the KSP solver, i.e. plot the number of iterations vs. the error/residual norm. I discovered the options -ksp_monitor_draw_true_residual and -ksp_monitor_true_residual.

Now the questions:

What does the graphical output of -ksp_monitor_draw_true_residual represent? I see the iteration count on the x-axis and expected the norm of the residual on the y-axis, but I see some scaled value there. Is this the logarithm of the residual, and of which residual?

If I output to .ps, the steps are overlaid and I don't see anything useful at the end. Is there a way to extract only the last picture?

Is there an easy (without hacking the PETSc sources) way to output a customized convergence monitoring line like -ksp_monitor_true_residual at every step?

Cheers,

ando
strange behaviour with PetscViewerBinary on MATIS
Hello,

I want to save my matrix A to disk and then process it with ksp/ksp/ex10. Doing this for type AIJ works fine. With type IS, it seems to save only the local matrix of one processor to disk and to dump the others to stdout.

PetscViewerBinaryOpen(commw,"matrix.bin",FILE_MODE_WRITE,&viewer1);
MatView(A,viewer1);

Is the only workaround to save the LocalToGlobalMapping and the local matrices separately and to read all this information back in, or do you see an easier way?

Is there a canonical way to save and restore the LocalToGlobalMapping?

Cheers,

ando
strange behaviour with PetscViewerBinary on MATIS
Jed Brown wrote:
> You can put this in MatView_IS if you really need it, but I doubt it
> will actually be useful. Unfortunately, you cannot change the domain
> decomposition with Neumann preconditioners, hence they will have
> limited use for solving a system with a saved matrix. Why do you want
> to save the matrix, it's vastly slower and less useful than a function
> which assembles that matrix?

I assemble the matrix by reading it out of a data structure produced by a proprietary program, and I just used this easy approach to compare the solvers on different machines where this program is not installed.

Since the implementation of the NN preconditioner is suboptimal anyway, I will not spend much time on this issue; my post to the list was mostly driven by curiosity.

Thanks for the explanation.

Cheers,

ando
Mumps speedup by passing symmetric matrix
Hello,

having solved the 64/32-bit issue, I now have MUMPS working and get reasonable results, but I'm wondering whether I could gain some performance by exploiting the symmetry of the matrix.

Setting only the option -mat_mumps_sym, I don't see any change in runtime, and INFOG(8) returns 100.

Setting MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE), I don't see any change either.

Does MUMPS recognize and use the symmetry automatically?

Cheers,

ando
ifort -i8 -r8 options
Barry Smith wrote:
> Mumps and hypre do not currently support using 32 bit integer indices,
                                                 ^^
Here you mean 64?!

> though the MUMPS folks say they plan to support it eventually.
>
> Changing PETSc to convert all 64 bit integers to 32 bit before passing
> to MUMPS and hypre is a huge project and we will not be doing that. You
> need to lobby the MUMPS and hypre to properly support 64 bit integers
> if you want to use them in that mode.
>
> Unless you are solving very large problems it seems you should be able
> to use the -r8 flag but not the -i8 flag.

For my needs this is certainly true, but I don't have the whole source code, and I am not able to get a working Diana program if I omit the -i8 flag.

So you suggest casting the input data from Diana to PetscInt, which is defined as 32-bit?!

Cheers,

ando
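The cast asked about here is only safe if every 64-bit index actually fits into 32 bits. A minimal sketch of such a checked narrowing, in plain Python with the standard struct module, could look as follows. The function name and the little-endian byte layout are assumptions made for illustration and have nothing to do with Diana's or PETSc's real data formats.

```python
import struct

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def narrow_int64_to_int32(raw64):
    """Convert packed little-endian 64-bit integers to 32-bit ones.

    Raises OverflowError instead of silently truncating a value that
    does not fit into a signed 32-bit integer.
    """
    if len(raw64) % 8 != 0:
        raise ValueError("input length is not a multiple of 8 bytes")
    count = len(raw64) // 8
    values = struct.unpack(f"<{count}q", raw64)
    for k, v in enumerate(values):
        if not (INT32_MIN <= v <= INT32_MAX):
            raise OverflowError(f"entry {k}: {v} does not fit into 32 bits")
    return struct.pack(f"<{count}i", *values)


# Three indices as they might arrive from a code compiled with -i8.
wide = struct.pack("<3q", 0, 5, 446912)
narrow = narrow_int64_to_int32(wide)
print(struct.unpack("<3i", narrow))
```

The explicit range check is the design point: for problem sizes like the ones in this thread the indices are far below 2^31, so the narrowing loses nothing, but a silent truncation would be very hard to debug.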
PCNN-preconditioner and floating domains
Hello,

Barry Smith wrote:
> To modify the code for elasticity is a big job. There are two faculty
> Axel Klawonn and O. Rheinbach at Universität Duisburg-Essen who have
> implemented a variety of these fast methods for elasticity. I suggest
> you contact them.

I wrote an email last week, but I still have no answer.

Investigating my code and the source code related to IS further, I noticed that there is a flag pure_neumann, which I guess should handle the singular Neumann problems that cause the trouble, but from my understanding of the code flow there is no situation in which it is set to true. Is this flag a remainder from previous implementations, or am I just looking in the wrong place?

Furthermore I'm wondering about the size of the coarse problem. From my understanding it should include all interface DOFs? But the size I get is the number of subdomains...

Last but not least, I want to thank you for the fast help you provide and to apologize for the questions, which may seem rather stupid but help me to find the right understanding.

Cheers,

ando
PCNN-preconditioner and floating domains
Barry Smith wrote:
> On Jun 29, 2009, at 4:55 PM, Andreas Grassl wrote:
>> Furthermore I'm wondering about the size of the coarse problem. From
>> my understanding it should include all interface DOF's? But the size
>> I get is the number of subdomains...
>
> It should be the number of subdomains times the dimension of the null
> space for the subdomains. For Laplacian that is just the number of
> subdomains. For 3d linear elasticity it is 6 times the number of
> subdomains.

OK, slowly I am getting an idea of where I have to change the code (at least I want to give it a try).

Cheers,

ando
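Barry's counting rule can be written out as a short sketch. The six null-space vectors of 3D linear elasticity on a floating subdomain are the rigid body modes: three translations and three infinitesimal rotations. The code below is plain Python and purely illustrative; the function names and the interleaved (ux, uy, uz) ordering per node are assumptions, not taken from the PETSc sources.

```python
def coarse_problem_size(num_subdomains, nullspace_dim):
    """Number of coarse unknowns: one per null-space vector per subdomain."""
    return num_subdomains * nullspace_dim


def rigid_body_modes(coords):
    """Six rigid body modes for 3D nodes given as (x, y, z) tuples.

    Each mode is a flat list [ux0, uy0, uz0, ux1, uy1, uz1, ...].
    The rotations are the cross products e_axis x r of the unit axis
    vector with the node position r.
    """
    modes = [[] for _ in range(6)]
    for x, y, z in coords:
        modes[0] += [1.0, 0.0, 0.0]   # translation in x
        modes[1] += [0.0, 1.0, 0.0]   # translation in y
        modes[2] += [0.0, 0.0, 1.0]   # translation in z
        modes[3] += [0.0, -z, y]      # rotation about x
        modes[4] += [z, 0.0, -x]      # rotation about y
        modes[5] += [-y, x, 0.0]      # rotation about z
    return modes


# Four subdomains, as in the beam example of this thread.
print(coarse_problem_size(4, 1))      # scalar Laplacian
print(coarse_problem_size(4, 6))      # 3D linear elasticity
modes = rigid_body_modes([(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)])
print(len(modes), len(modes[0]))
```

A subdomain that touches a Dirichlet boundary has a smaller (possibly empty) null space, which is one reason the general elasticity case needs more bookkeeping than the constant null space of the scalar problem.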
PCNN-preconditioner and floating domains
Hello again,

the issues from my last request regarding VecView and the data distribution among the different processors are solved, but I'm still experiencing great problems with the performance of the actual preconditioner.

I have an implementation of the BNN algorithm in Matlab from a previous project, which performs very well (about 5 iterations vs. 200 for plain CG) for a long linear elastic beam fixed at one end and loaded at the other end, discretized with solid cubic bricks (8 nodes, 24 DOFs). Condition number of the matrix: 1.5e7.

I have now modeled a similar beam in DIANA (a bit shorter, with fewer elements due to restrictions of the DIANA preprocessor) and tried to solve it with the PETSc solver.

The condition number of the matrix is of the same magnitude: ~3e7 (smallest singular value: ~1e-3, largest: ~4e4). The number of iterations for plain CG seems reasonable (437), but for the preconditioned system I get completely unexpected values: condition number ~7e12 (smallest singular value: ~1, largest: ~7e12) and therefore 612 iterations for CG.

The beam is divided into 4 subdomains. For more subdomains the KSP ran out of iterations (Converged_Reason -3).

I can imagine this is a problem of properly setting the null space, because only the first subdomain touches the boundary, but I have no idea how to specify the null space. So far I have not considered this issue at all.

Do I have to define a function which calculates the weights of the interface DOFs and applies them in some way to create an orthonormal basis? How do I do that? Is there an example anywhere?

Cheers,

ando
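The condition numbers quoted above are ratios of the largest to the smallest singular value, and the textbook worst-case CG estimate 2*((sqrt(kappa)-1)/(sqrt(kappa)+1))^k ties them to iteration counts. The sketch below evaluates both; it is plain Python, the function names and the default reduction factor are assumptions for illustration, and the bound is usually very pessimistic compared with observed counts such as 437 or 612.

```python
import math


def condition_number(sigma_min, sigma_max):
    """2-norm condition number from the extreme singular values."""
    return sigma_max / sigma_min


def cg_iteration_bound(kappa, reduction=1e-5):
    """Smallest k with 2*((sqrt(kappa)-1)/(sqrt(kappa)+1))**k <= reduction.

    This is the classical worst-case bound on the energy-norm error of
    CG for a symmetric positive definite system, not a prediction.
    """
    s = math.sqrt(kappa)
    if s <= 1.0:
        return 1
    log_rho = math.log1p(-2.0 / (s + 1.0))   # log((s-1)/(s+1)), stable for large s
    return math.ceil(math.log(reduction / 2.0) / log_rho)


kappa_plain = condition_number(1e-3, 4e4)    # unpreconditioned values from the post
kappa_prec = condition_number(1.0, 7e12)     # preconditioned values from the post
print(f"{kappa_plain:.1e} {kappa_prec:.1e}")
print(cg_iteration_bound(kappa_plain), cg_iteration_bound(kappa_prec))
```

The point of the comparison is only the direction: a preconditioner that raises the condition number by five orders of magnitude cannot be expected to reduce the iteration count, which matches the suspicion that the null space is not handled correctly.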
PCNN-preconditioner and floating domains
Barry Smith wrote:
> As we have said before the BNN in PETSc is ONLY implemented for a
> scalar PDE with a null space of the constant functions. If you shove
> in linear elasticity it isn't really going to work.

Do you have any suggestions for working around this drawback?

Do I understand correctly that this issue becomes problematic when floating subdomains appear?

Is it possible to provide the null space from the user side?

Or how big would the effort be to change the NN code so that it works for this class of problems?

Is such work useful at all, or do you regard BNN as a rather complicated method that other (even simpler) algorithms do much better?

Cheers,

Ando
VecView behaviour
Hi Jed,

the BNN algorithm in the literature always distinguishes between inner nodes and interface nodes. The short question arising from your explanation is whether "owned DOFs" is a synonym for the inner DOFs and "ghosted DOFs" for the interface DOFs. Below you will find more extended thoughts and an example.

Jed Brown wrote:
> Andreas Grassl wrote:
>> Barry Smith wrote:
>>> Hmm, it sounds like the difference between local ghosted vectors and the global parallel vectors. But I do not understand why any of the local vector entries would be zero. Doesn't the vector X that is passed into KSP (or SNES) have the global entries and uniquely define the solution? Why is viewing that not right?
>>
>> I still don't fully understand the underlying processes of the whole PCNN solution procedure, but by trial and error I replaced
>>
>>   MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE, gridmapping, &A);
>
> This creates a matrix that is bigger than you want, and gives you the dead values at the end (global dofs that are not in the range of the LocalToGlobalMapping). This is from the note on MatCreateIS:
>
> | m and n are NOT related to the size of the map, they are the size of the part of the vector owned
> | by that process. m + nghosts (or n + nghosts) is the length of map since map maps all local points
> | plus the ghost points to global indices.
>
>> by
>>
>>   MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, gridmapping, &A);
>
> This creates a matrix of the correct size, but it looks like it could easily end up with the wrong dofs owned locally. What you probably want to do is:
>
> 1. Resolve ownership just like with any other DD method. This partitions your dofs into n owned dofs and ngh ghosted dofs on each process. The global sum of n is N, the size of the global vectors that the solver will interact with.

Do I understand correctly that the owned dofs are the inner nodes and the ghosted dofs are the interface dofs?

> 2. Make an ISLocalToGlobalMapping where all the owned dofs come first, mapping (0..n-1) to (rstart..rstart+n-1), followed by the ghosted dofs (local index n..n+ngh-1) which map to remote processes. (rstart is the global index of the first owned dof)

Currently I set up my ISLocalToGlobalMapping by giving each process all of its dofs in arbitrary order, with the effect that the interface dofs appear several times. Attached is a small example with 2 subdomains and 270 DOFs.

> One way to do this is to use MPI_Scan to find rstart, then number all the owned dofs and scatter the result. The details will be dependent on how you store your mesh. (I'm assuming it's unstructured, this step is trivial if you use a DA.)

Yes, the mesh is unstructured. I read the element-based partitioning from the FE package, loop over all elements to find the DOFs belonging to them, and assemble the index vector for the ISLocalToGlobalMapping this way. I do not treat interface DOFs specially, because I thought this would be handled automatically when the mapping is set up, since some global DOFs then appear several times.

> 3. Call MatCreateIS(comm,n,n,PETSC_DECIDE,PETSC_DECIDE,mapping,&A);

Looking at this function call and interpreting the owned DOFs as the subdomain inner DOFs, the matrix A does not have the full size?! Given a 4x6 grid with 1 DOF per node divided into 4 subdomains, I get 9 interface DOFs:

    0  o  o  O  o  5
             |
    6  o  o  O  o  o
             |
    O--O--O--O--O--O
             |
    o  o  o  O  o  23

My first approach to creating the matrix would give a matrix of size 35x35, with 11 dead entries at the end of the vector. My second approach would give the correct matrix size of 24x24. By splitting into n owned values and some ghosted values I would expect to get a matrix of size 15x15. Otherwise I don't see how I could partition the grid in a consistent way.

I would really appreciate it if you could show me how the partition and the ownership of the DOFs work out in this little example.

cheers,
ando
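The ownership question for the 4x6 grid above can be worked through by hand. The following is only a sketch in plain C (no PETSc calls), under one loudly stated assumption that is not from the thread: every DOF shared by several subdomains is owned by the lowest-numbered subdomain that touches it. The names `in_subdomain`, `owner_of` and `resolve_ownership` are hypothetical. With this rule the owned counts sum to 24 rather than 15, because each interface DOF is owned by exactly one subdomain and ghosted on the others, and the ghost counts sum to the 11 extra entries of the 35x35 variant.

```c
#include <assert.h>

#define NROWS 4
#define NCOLS 6
#define NSUB  4
#define NDOF  (NROWS * NCOLS)

/* Returns 1 if grid node (row, col) belongs to subdomain s, else 0.
 * Subdomains: 0 = upper left, 1 = upper right, 2 = lower left, 3 = lower right.
 * The interface is column 3 and row 2; interface nodes belong to every
 * subdomain adjacent to them. */
static int in_subdomain(int s, int row, int col)
{
    int left   = (s == 0 || s == 2);
    int upper  = (s == 0 || s == 1);
    int col_ok = left  ? (col <= 3) : (col >= 3);
    int row_ok = upper ? (row <= 2) : (row >= 2);
    return col_ok && row_ok;
}

/* ASSUMED rule (not from the thread): the lowest-numbered subdomain
 * touching a DOF owns it. */
static int owner_of(int row, int col)
{
    for (int s = 0; s < NSUB; s++)
        if (in_subdomain(s, row, col)) return s;
    return -1;
}

/* Fills n[s] (owned DOFs), ngh[s] (ghosted DOFs) and rstart[s]
 * (global index of the first owned DOF) for every subdomain. */
static void resolve_ownership(int n[NSUB], int ngh[NSUB], int rstart[NSUB])
{
    for (int s = 0; s < NSUB; s++) { n[s] = 0; ngh[s] = 0; }

    for (int row = 0; row < NROWS; row++)
        for (int col = 0; col < NCOLS; col++)
            for (int s = 0; s < NSUB; s++)
                if (in_subdomain(s, row, col)) {
                    if (owner_of(row, col) == s) n[s]++;
                    else                         ngh[s]++;
                }

    /* Exclusive prefix sum over the owned counts; in a parallel run this
     * is the quantity one would obtain with MPI_Scan. */
    rstart[0] = 0;
    for (int s = 1; s < NSUB; s++) rstart[s] = rstart[s - 1] + n[s - 1];

    /* Every DOF is owned exactly once, so the owned counts sum to N. */
    int total = 0;
    for (int s = 0; s < NSUB; s++) total += n[s];
    assert(total == NDOF);
}
```

Under the assumed rule the four subdomains own 12, 6, 4 and 2 DOFs and see 0, 3, 4 and 4 ghosts, so the local sizes n + ngh are 12, 9, 8 and 6, while the global size stays 24.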
VecView behaviour
Here is the promised mapping for my last post.

Attachment (scrubbed by the list archive):
Name: mapping.dat
URL: http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20090610/04b48196/attachment.diff
VecView behaviour
Jed Brown wrote:
> Does this help?

I think this will help me a lot, but I'll give it a try tomorrow, since it is already late afternoon here and I still have some unfinished work.

Thank you so far!

cu
ando
VecView behaviour
Barry Smith wrote:
> Run with GMRES, what happens?

Same behaviour...

> On Jun 4, 2009, at 11:07 AM, Andreas Grassl wrote:
>> [...]
VecView behaviour
Andreas Grassl wrote:
> Barry Smith wrote:
>> Run with GMRES, what happens?
>
> Same behaviour...

i.e. 443 iterations. However, I noticed some differences between hardcoding the KSP and PC type and giving them as runtime options (443 vs. 354 iterations with gmres and 339 vs. 333 with cg).

I don't have to reorder the matrix manually, right?

Output of -ksp_view:

  KSP Object:
    type: gmres
      GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
      GMRES: happy breakdown tolerance 1e-30
    maximum iterations=1
    tolerances: relative=1e-08, absolute=1e-50, divergence=1
    left preconditioning
  PC Object:
    type: nn
    linear system matrix = precond matrix:
    Matrix Object:
      type=is, rows=28632, cols=28632
        Matrix Object:(is)
          type=seqaij, rows=7537, cols=7537
          total: nonzeros=359491, allocated nonzeros=602960
            using I-node routines: found 4578 nodes, limit used is 5
        Matrix Object:(is)
          type=seqaij, rows=7515, cols=7515
          total: nonzeros=349347, allocated nonzeros=601200
            using I-node routines: found 5159 nodes, limit used is 5
        Matrix Object:(is)
          type=seqaij, rows=7533, cols=7533
          total: nonzeros=357291, allocated nonzeros=602640
            using I-node routines: found 4739 nodes, limit used is 5
        Matrix Object:(is)
          type=seqaij, rows=7360, cols=7360
          total: nonzeros=364390, allocated nonzeros=588800
            using I-node routines: found 3602 nodes, limit used is 5

cheers,
ando
VecView behaviour
Thank you for the explanation. First I'll try the null space setup, and then I'll come back to your hints.

Jed Brown wrote:
> Andreas Grassl wrote:
>> Barry Smith wrote:
>>> Hmm, it sounds like the difference between local ghosted vectors and the global parallel vectors. But I do not understand why any of the local vector entries would be zero. Doesn't the vector X that is passed into KSP (or SNES) have the global entries and uniquely define the solution? Why is viewing that not right?
>>
>> I still don't fully understand the underlying processes of the whole PCNN solution procedure, but by trial and error I replaced
>>
>>   MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE, gridmapping, &A);
>
> This creates a matrix that is bigger than you want, and gives you the dead values at the end (global dofs that are not in the range of the LocalToGlobalMapping). This is from the note on MatCreateIS:
>
> | m and n are NOT related to the size of the map, they are the size of the part of the vector owned
> | by that process. m + nghosts (or n + nghosts) is the length of map since map maps all local points
> | plus the ghost points to global indices.
>
>> by
>>
>>   MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, gridmapping, &A);
>
> This creates a matrix of the correct size, but it looks like it could easily end up with the wrong dofs owned locally. What you probably want to do is:
>
> 1. Resolve ownership just like with any other DD method. This partitions your dofs into n owned dofs and ngh ghosted dofs on each process. The global sum of n is N, the size of the global vectors that the solver will interact with.
>
> 2. Make an ISLocalToGlobalMapping where all the owned dofs come first, mapping (0..n-1) to (rstart..rstart+n-1), followed by the ghosted dofs (local index n..n+ngh-1) which map to remote processes. (rstart is the global index of the first owned dof)
>
>    One way to do this is to use MPI_Scan to find rstart, then number all the owned dofs and scatter the result. The details will be dependent on how you store your mesh. (I'm assuming it's unstructured, this step is trivial if you use a DA.)
>
> 3. Call MatCreateIS(comm,n,n,PETSC_DECIDE,PETSC_DECIDE,mapping,&A);
>
>> Furthermore, it seems that the load balance is now better, although I still don't reach the expected values, e.g.
>>
>>   ilu-cg              320 iterations, condition 4601
>>   cg only            1662 iterations, condition 84919
>>   nn-cg on 2 nodes    229 iterations, condition 6285
>>   nn-cg on 4 nodes    331 iterations, condition 13312
>>
>> Or should one not expect nn-cg to be faster than ilu-cg?
>
> It depends a lot on the problem. As you probably know, for a second order elliptic problem with exact subdomain solves, the NN preconditioned operator (without a coarse component) has condition number that scales as
>
>   (1/H^2)(1 + log(H/h))^2
>
> where H is the subdomain diameter and h is the element size. In contrast, overlapping additive Schwarz is 1/H^2 and block Jacobi is 1/(Hh) (the original problem was 1/h^2). In particular, there is no reason to expect that NN is uniformly better than ASM, although it may be for certain problems. When a coarse solve is used, NN becomes
>
>   (1 + log(H/h))^2
>
> which is quasi-optimal (these methods are known as BDDC, which is essentially equivalent to FETI-DP). The key advantage over multigrid (or multilevel Schwarz) is improved robustness with variable coefficients.
>
> My understanding is that PCNN is BDDC, and uses direct subdomain solves by default, but I could have missed something. In particular, if the coarse solve is missing or inexact solves are used, you could easily see relatively poor scaling. AFAIK, it's not for vector problems at this time.
>
> Good luck.
>
> Jed
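For reference, the scalings quoted in Jed's message can be collected in one place. This is only a restatement of the estimates given above for a second-order elliptic problem with exact subdomain solves; constants are omitted, and the subscript labels are ad hoc names introduced here.

```latex
\begin{align*}
\kappa_{\text{original}}           &\sim \frac{1}{h^{2}}, \\
\kappa_{\text{block Jacobi}}       &\sim \frac{1}{Hh}, \\
\kappa_{\text{overlapping ASM}}    &\sim \frac{1}{H^{2}}, \\
\kappa_{\text{NN, no coarse}}      &\sim \frac{1}{H^{2}}\left(1+\log\frac{H}{h}\right)^{2}, \\
\kappa_{\text{NN, coarse solve}}   &\sim \left(1+\log\frac{H}{h}\right)^{2}.
\end{align*}
```

As a rough illustration, assuming the natural logarithm and a constant of one: for $H/h = 16$ the factor $(1+\log(H/h))^2$ is about $14.2$, and it no longer depends on the number of subdomains. The measured condition numbers of 6285 and 13312 for nn-cg are far above that, which is consistent with the remark that a missing or ineffective coarse solve leads to poor scaling.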
VecView behaviour
Barry Smith wrote:
> On Jun 3, 2009, at 5:29 PM, Andreas Grassl wrote:
>> Barry Smith wrote:
>>> When properly running nn-cg (are you sure everything is symmetric?) should require 10-30 iterations (certainly for model problems)
>>
>> ok, this was the number I expected.
>>
>>   nn-cg on 2 nodes    229 iterations, condition 6285
>>   nn-cg on 4 nodes    331 iterations, condition 13312
>>
>>> Are you sure that your operator has the null space of only constants?
>>
>> no, I didn't touch anything regarding the null space since I thought it would be done inside the NN-preconditioner. Does this mean I have to set up a null space of the size of the Schur complement system, i.e. the number of interface DOF's?
>
> No, I don't think you need to do anything about the null space. The code in PETSc for NN is for (and only for) a null space of constants. BTW: with 2 or 4 subdomains they all touch the boundary and likely don't have a null space anyways.
>
> Run with -ksp_view and make sure the local solves are being done with LU

I can't find any anomalies... Setting the rtol of the local KSPs to 1e-8 doesn't change anything.

The options passed are:

  -is_localD_ksp_type preonly
  -is_localD_pc_factor_shift_positive_definite
  -is_localD_pc_type lu
  -is_localN_ksp_type preonly
  -is_localN_pc_factor_shift_positive_definite
  -is_localN_pc_type lu
  -ksp_rtol 1e-8
  -ksp_view
  #-is_localD_ksp_view
  #-is_localN_ksp_view
  #-nn_coarse_ksp_view
  # -pc_is_remove_nullspace_fixed this option doesn't produce any effect
  -log_summary
  -options_left

and they produce:

-ksp_view:

  KSP Object:
    type: cg
    maximum iterations=1
    tolerances: relative=1e-08, absolute=1e-50, divergence=1
    left preconditioning
  PC Object:
    type: nn
    linear system matrix = precond matrix:
    Matrix Object:
      type=is, rows=28632, cols=28632
        Matrix Object:(is)
          type=seqaij, rows=7537, cols=7537
          total: nonzeros=359491, allocated nonzeros=602960
            using I-node routines: found 4578 nodes, limit used is 5
        Matrix Object:(is)
          type=seqaij, rows=7515, cols=7515
          total: nonzeros=349347, allocated nonzeros=601200
            using I-node routines: found 5159 nodes, limit used is 5
        Matrix Object:(is)
          type=seqaij, rows=7533, cols=7533
          total: nonzeros=357291, allocated nonzeros=602640
            using I-node routines: found 4739 nodes, limit used is 5
        Matrix Object:(is)
          type=seqaij, rows=7360, cols=7360
          total: nonzeros=364390, allocated nonzeros=588800
            using I-node routines: found 3602 nodes, limit used is 5

-is_local...:

  KSP Object:(is_localD_)
    type: preonly
    maximum iterations=1, initial guess is zero
    tolerances: relative=1e-05, absolute=1e-50, divergence=1
    left preconditioning
  PC Object:(is_localD_)
    type: lu
      LU: out-of-place factorization
        matrix ordering: nd
      LU: tolerance for zero pivot 1e-12
      LU: using Manteuffel shift
      LU: factor fill ratio needed 4.73566
        Factored matrix follows
          Matrix Object:
            type=seqaij, rows=6714, cols=6714
            package used to perform factorization: petsc
            total: nonzeros=1479078, allocated nonzeros=1479078
              using I-node routines: found 2790 nodes, limit used is 5
    linear system matrix = precond matrix:
    Matrix Object:
      type=seqaij, rows=6714, cols=6714
      total: nonzeros=312328, allocated nonzeros=312328
        using I-node routines: found 4664 nodes, limit used is 5
  KSP Object:(is_localN_)
    type: preonly
    maximum iterations=1, initial guess is zero
    tolerances: relative=1e-05, absolute=1e-50, divergence=1
    left preconditioning
  PC Object:(is_localN_)
    type: lu
      LU: out-of-place factorization
        matrix ordering: nd
      LU: tolerance for zero pivot 1e-12
      LU: using Manteuffel shift
      LU: factor fill ratio needed 5.07571
        Factored matrix follows
          Matrix Object:
            type=seqaij, rows=7537, cols=7537
            package used to perform factorization: petsc
            total: nonzeros=1824671, allocated nonzeros=1824671
              using I-node routines: found 2939 nodes, limit used is 5
    linear system matrix = precond matrix:
    Matrix Object:(is)
      type=seqaij, rows=7537, cols=7537
      total: nonzeros=359491, allocated nonzeros=602960
        using I-node routines: found 4578 nodes, limit used is 5

-nn_coarse_ksp_view:

  KSP Object:(nn_coarse_)
    type: preonly
    maximum iterations=1, initial guess is zero
    tolerances: relative=1e-05, absolute=1e-50, divergence=1
    left preconditioning
  PC Object:(nn_coarse_)
    type: redundant
      Redundant preconditioner: First (color=0) of 4 PCs follows
    KSP Object:(redundant_)
      type: preonly
      maximum iterations=1, initial guess is zero
      tolerances: relative=1e-05, absolute=1e-50, divergence=1
      left preconditioning
    PC Object:(redundant_)
      type: lu
        LU: out-of-place factorization
          matrix ordering: nd
        LU: tolerance for zero pivot 1e-12
        LU: factor fill
VecView behaviour
Barry Smith wrote:
> Hmm, it sounds like the difference between local ghosted vectors and the global parallel vectors. But I do not understand why any of the local vector entries would be zero. Doesn't the vector X that is passed into KSP (or SNES) have the global entries and uniquely define the solution? Why is viewing that not right?

I still don't fully understand the underlying processes of the whole PCNN solution procedure, but by trial and error I replaced

  MatCreateIS(commw, ind_length, ind_length, PETSC_DECIDE, PETSC_DECIDE, gridmapping, &A);

by

  MatCreateIS(commw, PETSC_DECIDE, PETSC_DECIDE, actdof, actdof, gridmapping, &A);

and received the needed results.

Furthermore, it seems that the load balance is now better, although I still don't reach the expected values, e.g.

  ilu-cg              320 iterations, condition 4601
  cg only            1662 iterations, condition 84919
  nn-cg on 2 nodes    229 iterations, condition 6285
  nn-cg on 4 nodes    331 iterations, condition 13312

Or should one not expect nn-cg to be faster than ilu-cg?

cheers,
ando
VecView behaviour
Barry Smith wrote:
> On May 29, 2009, at 4:34 AM, Andreas Grassl wrote:
>> Hello, I'm working with the PCNN preconditioner and hence with ISLocalToGlobalMapping. After solving I want to write the solution to an ASCII-file where only the values belonging to the external global numbering are given and not followed by the zeros.
>
> What do you mean? What parts of the vector do you want?

I want the first actdof entries.

actdof is the number of DOFs of the system. The values in indices are in the range 0 to actdof-1. I create the mapping with

  ISLocalToGlobalMappingCreate(commw,ind_length,indices,&gridmapping);

Because of the interface DOFs, the sum over all ind_length is greater than actdof, and this sum is the size of the vectors. However, only actdof entries of such a vector are nonzero when I view it.

>> Currently I'm giving these commands:
>>
>>   ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_SYMMODU);CHKERRQ(ierr);
>>   ierr = VecView(X,viewer);CHKERRQ(ierr);

I hope this gives you an idea of the problem I have.

Cheers,
ando
VecView behaviour
Hello,

I'm working with the PCNN preconditioner and hence with ISLocalToGlobalMapping. After solving, I want to write the solution to an ASCII file that contains only the values belonging to the external global numbering, without the trailing zeros.

Currently I'm using these commands:

  ierr = PetscViewerSetFormat(viewer,PETSC_VIEWER_ASCII_SYMMODU);CHKERRQ(ierr);
  ierr = VecView(X,viewer);CHKERRQ(ierr);

Does anybody have an idea which option or function could help me?

cheers,
ando
Preallocating Matrix
Hello,

I'm assembling large matrices, specifying only the number of nonzeros per row for preallocation. I am wondering whether it is possible to extract the nonzero structure of an assembled matrix in an array format that can be fed back into

  MatSeqAIJSetPreallocation(Mat B,PetscInt nz,const PetscInt nnz[])

in order to detect the bottleneck.

cheers
ando
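A different route to the same `nnz[]` array, shown here only as a sketch: instead of extracting the structure from an assembled matrix, count the nonzeros per row directly from the element connectivity before assembly. The assumptions are not from the post: one DOF per node, 4-node elements, and a dense scratch array that is only acceptable for small examples. The function name `count_row_nonzeros` is hypothetical and no PETSc call is made; the resulting array has the shape expected by the `nnz` argument quoted above.

```c
#include <assert.h>
#include <stdlib.h>

#define NODES_PER_ELEM 4

/* Counts the structural nonzeros of every matrix row from the element
 * connectivity, assuming one DOF per node. Two nodes are coupled when
 * they share at least one element.
 * Returns 0 on success, -1 if the scratch allocation fails. */
static int count_row_nonzeros(int nnodes, int nelem,
                              const int conn[][NODES_PER_ELEM], int nnz[])
{
    /* coupled[i*nnodes + j] != 0 means entry (i,j) is structurally nonzero.
     * Dense scratch storage: fine for a sketch, too large for real meshes,
     * where a per-row sorted list or hash set would be used instead. */
    unsigned char *coupled = calloc((size_t)nnodes * (size_t)nnodes, 1);
    if (!coupled) return -1;

    for (int e = 0; e < nelem; e++)
        for (int a = 0; a < NODES_PER_ELEM; a++)
            for (int b = 0; b < NODES_PER_ELEM; b++) {
                int i = conn[e][a];
                int j = conn[e][b];
                assert(i >= 0 && i < nnodes && j >= 0 && j < nnodes);
                coupled[(size_t)i * (size_t)nnodes + (size_t)j] = 1;
            }

    /* Row sums of the coupling table are the per-row nonzero counts. */
    for (int i = 0; i < nnodes; i++) {
        nnz[i] = 0;
        for (int j = 0; j < nnodes; j++)
            nnz[i] += coupled[(size_t)i * (size_t)nnodes + (size_t)j];
    }

    free(coupled);
    return 0;
}
```

For two quadrilaterals sharing an edge (nodes 0 1 2 in the top row, 3 4 5 in the bottom row), the two shared nodes couple to all six nodes and the four corner nodes to four each.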
problems with MatLoad
Matthew Knepley wrote:
> On Wed, Apr 8, 2009 at 11:12 AM, Andreas Grassl <Andreas.Grassl at student.uibk.ac.at> wrote:
>> Hello, I got some success on the localtoglobalmapping, but now I'm stuck with writing to/reading from files. In a sequential code I write out some matrices with
>>
>>   PetscViewerBinaryOpen(comms,matrixname,FILE_MODE_WRITE,&viewer);
>>   for (k=0;k<np;k++){ MatView(AS[k],viewer);}
>>   PetscViewerDestroy(viewer);
>>
>> and want to read them in in a parallel program, where each processor should own one matrix:
>>
>>   ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,matrixname,FILE_MODE_READ,&viewer);CHKERRQ(ierr);
>
> The Viewer has COMM_WORLD, but you are reading a matrix with COMM_SELF. You need to create a separate viewer for each process to do what you want.

Thank you for the fast answer. I have resolved this issue now, but how can I gather the matrix from COMM_SELF to COMM_WORLD? I searched for functions that do this kind of matrix copying, but MatConvert and MatCopy act on the same communicator.

Thanks in advance
ando
problems with MatLoad
Andreas Grassl wrote:
> Matthew Knepley wrote:
>> The Viewer has COMM_WORLD, but you are reading a matrix with COMM_SELF. You need to create a separate viewer for each process to do what you want.
>
> Thank you for the fast answer. I have resolved this issue now, but how can I gather the matrix from COMM_SELF to COMM_WORLD? I searched for functions that do this kind of matrix copying, but MatConvert and MatCopy act on the same communicator.

I think I have solved this as well: I simply called MatGetValues on the matrix of one communicator and MatSetValues on the matrix of the other.

Kind Regards,
ando
problems with MatLoad
Hello,

I got some success on the localtoglobalmapping, but now I'm stuck with writing to and reading from files. In a sequential code I write out some matrices with

  PetscViewerBinaryOpen(comms,matrixname,FILE_MODE_WRITE,&viewer);
  for (k=0;k<np;k++){ MatView(AS[k],viewer);}
  PetscViewerDestroy(viewer);

and want to read them in a parallel program, where each processor should own one matrix:

  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,matrixname,FILE_MODE_READ,&viewer);CHKERRQ(ierr);
  ierr = MatLoad(viewer,MATSEQAIJ,&AS[rank]);CHKERRQ(ierr);
  ierr = MatAssemblyBegin(AS[rank], MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(AS[rank], MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(viewer);CHKERRQ(ierr);

The program hangs in the line with MatLoad and gives the following output on every node:

  [0]PETSC ERROR: --------------------- Error Message ------------------------------------
  [0]PETSC ERROR: Argument out of range!
  [0]PETSC ERROR: Comm must be of size 1!

I tried to serialize the reads with PetscSequentialPhaseBegin(PETSC_COMM_WORLD,1) and to perform the file read in a loop.

Any suggestions as to what could be going wrong?

thank you
ando
PCNN preconditioner and setting the interface
Barry Smith wrote: On Mar 24, 2009, at 11:34 AM, Andreas Grassl wrote: Barry Smith wrote: On Mar 24, 2009, at 8:45 AM, Andreas Grassl wrote:

Hello, I'm working with a FE-Software where I get out the element stiffness matrices and the element-node correspondence to set up the stiffness matrix for solving with PETSc. I'm currently fighting with the interface definition. My LocalToGlobalMapping for test purposes was the identity IS, but I guess this is far from the optimum, because a node set of interface nodes is defined nowhere. How do I declare the interface? Is it simply a reordering of the nodes, so that the inner nodes are numbered first and the interface nodes last?

Here's the deal. Over all the processors you have to have a single GLOBAL numbering of the nodes. The first process starts with 0 and each process starts off with one more than the previous process had.

I am confused now, because after you said to use MatSetValuesLocal() to put the values in the matrix, I thought local means the unique (sequential) numbering independent of the processors in use and global a processor-specific (parallel) numbering.

No, each process has its own independent local numbering from 0 to nlocal-1; the ISLocalToGlobalMapping you create gives the global number for each local number.

So the single GLOBAL numbering is the numbering obtained from the FE-Software, represented by {0,...,23}:

    0  o  o  O  o  5
             |
    6  o  o  O  o  o
             |
    O--O--O--O--O--O
             |
    o  o  o  O  o  23

And I set the 4 different local numberings {0,...,11}, {0,...,8}, {0,...,7}, {0,...,5} with the call of ISLocalToGlobalMappingCreate? How do I set the different indices? {0,1,2,3,6,7,8,9,12,13,14,15} would be the index vector for the upper left subdomain and {3,9,12,13,14,15} the index vector for its interface.

I don't understand your figure, but I don't think it matters.

It is a 2D grid arising from a FE discretization with 4-node elements. The small o-nodes are inner nodes, the big O-nodes are interface nodes, numbered row-wise from upper left to lower right. Let's assume these node numbers correspond to the DOF numbers in the system of equations and that we disregard the boundary for now, so I get a 24x24 matrix which has to be partitioned into 4 subdomains.

The struct PC_IS defined in src/ksp/pc/impls/is/pcis.h contains IS objects holding such information (I suppose, at least), but I have no idea how to use them efficiently. Do I have to manage a PC_IS object for every subdomain?

In the way it is implemented EACH process has ONE subdomain. Thus each process has ONE local to global mapping.

Is there a possibility to set this mapping from my global view? Or do I have to assemble the matrix locally?

You are getting yourself confused thinking things are more complicated than they really are.

I'll try to change my point of view so that I understand things more easily ;-)

cheers
ando
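The index vectors mentioned in this exchange can be generated mechanically. The following plain-C sketch (no PETSc calls) builds, for each of the four subdomains of the 4x6 grid, the list of global node numbers in row-wise order; position k in that list is local index k. Such an array is what one would hand to ISLocalToGlobalMappingCreate as the `indices` argument. The function names and the subdomain numbering (0 = upper left, 1 = upper right, 2 = lower left, 3 = lower right) are assumptions made for illustration.

```c
#include <assert.h>

enum { GRID_ROWS = 4, GRID_COLS = 6, IFACE_ROW = 2, IFACE_COL = 3 };

/* Writes the global node numbers of subdomain s into idx[] in row-wise
 * order and returns how many were written. idx must have room for at
 * least 12 entries (the largest subdomain). Interface nodes (row 2 and
 * column 3) are included in every subdomain adjacent to them. */
static int build_local_to_global(int s, int idx[])
{
    assert(s >= 0 && s < 4);

    int row_lo = (s < 2) ? 0 : IFACE_ROW;
    int row_hi = (s < 2) ? IFACE_ROW : GRID_ROWS - 1;
    int col_lo = (s % 2 == 0) ? 0 : IFACE_COL;
    int col_hi = (s % 2 == 0) ? IFACE_COL : GRID_COLS - 1;

    int count = 0;
    for (int row = row_lo; row <= row_hi; row++)
        for (int col = col_lo; col <= col_hi; col++)
            idx[count++] = row * GRID_COLS + col;  /* row-wise global number */
    return count;
}

/* Returns 1 if the global node number lies on the interface, else 0. */
static int is_interface(int global)
{
    return (global / GRID_COLS == IFACE_ROW) || (global % GRID_COLS == IFACE_COL);
}
```

For subdomain 0 this reproduces {0,1,2,3,6,7,8,9,12,13,14,15}, and filtering that list with `is_interface` gives {3,9,12,13,14,15}; the four lists have 12, 9, 8 and 6 entries, matching the local numberings {0,...,11}, {0,...,8}, {0,...,7} and {0,...,5}.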
PCNN preconditioner and setting the interface
Hello,

I'm working with a FE-Software from which I get the element stiffness matrices and the element-node correspondence in order to set up the stiffness matrix for solving with PETSc. I'm currently fighting with the interface definition. My LocalToGlobalMapping for test purposes was the identity IS, but I guess this is far from the optimum, because a node set of interface nodes is defined nowhere.

How do I declare the interface? Is it simply a reordering of the nodes, so that the inner nodes are numbered first and the interface nodes last?

Thank you in advance
ando
PCNN preconditioner
Barry Smith wrote:
> For the subdomain solves the prefixes are is_localD_ and is_localN_, so you should use the options -is_localD_pc_factor_shift_positive_definite and -is_localN_pc_factor_shift_positive_definite

With both options it is working now.

> There is currently no subroutine that pulls out the inner KSP's for the Neumann and Dirichlet problems for use in the code; though there should be PCISGetDPC() and PCISGetNPC() that would get the pointer to ksp_N and ksp_D objects inside the PC_IS data structure defined in src/ksp/pc/impls/is/pcis.h. You can easily add these routines. Then use them to get the inner PC and set the shift option (and anything else you want to set).

I'll try that later; for now I'm happy with the options.

Thank you for helping, I'll give some feedback.

cheers
ando
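For convenience, the subdomain options named in this thread can be kept together in a PETSc options file. The fragment below only collects option names that appear in these messages for PETSc 3.0.0; the grouping comments are added here, and reading the D and N prefixes as the Dirichlet and Neumann solves follows Barry's mention of ksp_D and ksp_N.

```
# Dirichlet subdomain solves (prefix is_localD_)
-is_localD_ksp_type preonly
-is_localD_pc_type lu
-is_localD_pc_factor_shift_positive_definite

# Neumann subdomain solves (prefix is_localN_)
-is_localN_ksp_type preonly
-is_localN_pc_type lu
-is_localN_pc_factor_shift_positive_definite
```

Passing the two shift options on the command line has the same effect; the file form just makes it harder to set the shift for only one of the two local solves by accident.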
PCNN preconditioner
Any suggestions?

cheers
ando

Andreas Grassl wrote:
> [...]
PCNN preconditioner
Barry Smith wrote:
> Use MatCreateIS() to create the matrix. Use MatSetValuesLocal() to put the
> values in the matrix then use PCSetType(pc,PCNN); to set the preconditioner
> to NN.

I followed your advice, but still run into problems.

my source code:

  ierr = KSPCreate(comm,&solver);CHKERRQ(ierr);
  ierr = KSPSetOperators(solver,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetInitialGuessNonzero(solver,PETSC_TRUE);CHKERRQ(ierr);
  ierr = KSPGetPC(solver,&prec);CHKERRQ(ierr);
  ierr = PCSetType(prec,PCNN);CHKERRQ(ierr);
  //ierr = PCFactorSetShiftPd(prec,PETSC_TRUE);CHKERRQ(ierr);
  ierr = KSPSetUp(solver);CHKERRQ(ierr);
  ierr = KSPSolve(solver,B,X);CHKERRQ(ierr);

and the error message:

[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Detected zero pivot in LU factorization
see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot!
[0]PETSC ERROR: Zero pivot row 801 value 2.78624e-13 tolerance 4.28598e-12 * rowsum 4.28598!
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 3, Fri Jan 30 17:55:56 CST 2009
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Unknown Name on a linux64-g named mat1.uibk.ac.at by csae1801 Fri Feb 27 10:12:34 2009
[0]PETSC ERROR: Libraries linked from /home/lux/csae1801/petsc/petsc-3.0.0-p3/linux64-gnu-c-debug/lib
[0]PETSC ERROR: Configure run at Wed Feb 18 10:30:58 2009
[0]PETSC ERROR: Configure options --with-64-bit-indices --with-scalar-type=real --with-precision=double --with-cc=icc --with-fc=ifort --with-cxx=icpc --with-shared=0 --with-mpi=1 --download-mpich=ifneeded --with-scalapack=1 --download-scalapack=ifneeded --download-f-blas-lapack=yes --with-blacs=1 --download-blacs=yes PETSC_ARCH=linux64-gnu-c-debug
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: MatLUFactorNumeric_Inode() line 1335 in src/mat/impls/aij/seq/inode.c
[0]PETSC ERROR: MatLUFactorNumeric() line 2338 in src/mat/interface/matrix.c
[0]PETSC ERROR: PCSetUp_LU() line 222 in src/ksp/pc/impls/factor/lu/lu.c
[0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c
[0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: PCISSetUp() line 137 in src/ksp/pc/impls/is/pcis.c
[0]PETSC ERROR: PCSetUp_NN() line 28 in src/ksp/pc/impls/is/nn/nn.c
[0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c
[0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: User provided function() line 1274 in petscsolver.c

Running PCFactorSetShift doesn't affect the output.

any ideas?

cheers

ando

--
Grassl Andreas
Uni Innsbruck, Institut f. Mathematik
Technikerstr. 13, Zi 709
+43 (0)512 507 6091
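For readers following the error message: the reported pivot value 2.78624e-13 is below the reported tolerance 4.28598e-12, which equals 1e-12 times the reported row sum 4.28598. The sketch below is not PETSc's implementation; it is a minimal pure-Python LU without pivoting, with a hypothetical relative check (`rel_tol`, `lu_pivots` and both test matrices are assumptions made for illustration). It shows one situation that produces such a message: a stiffness matrix with no constrained node is singular, so the last pivot vanishes. Whether that is the cause in this thread is not established here.

```python
def lu_pivots(matrix, rel_tol=1e-12):
    """Return the pivots of an LU factorization without row exchanges.

    Raises ArithmeticError when a pivot is negligible compared with the
    absolute row sum of the original row (illustrative check only).
    """
    n = len(matrix)
    a = [[float(v) for v in row] for row in matrix]  # work on a copy
    row_sums = [sum(abs(v) for v in row) for row in a]
    pivots = []
    for k in range(n):
        pivot = a[k][k]
        threshold = rel_tol * row_sums[k]
        if abs(pivot) <= threshold:
            raise ArithmeticError(
                "zero pivot row %d value %g tolerance %g"
                % (k, abs(pivot), threshold)
            )
        pivots.append(pivot)
        # Eliminate the entries below the pivot.
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return pivots


# 1D chain of two unit springs with no support: singular (rigid-body mode).
floating = [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]

# Same chain with the first node also tied to a fixed support: nonsingular.
constrained = [[2, -1, 0], [-1, 2, -1], [0, -1, 1]]

if __name__ == "__main__":
    print(lu_pivots(constrained))
    try:
        lu_pivots(floating)
    except ArithmeticError as err:
        print("floating chain:", err)
```

In this toy setting the factorization of `constrained` succeeds, while `floating` fails at the last row because the matrix has a null space (a uniform shift of all nodes).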
PCNN preconditioner
Hello,

I'm working with FE software from which I get the element stiffness matrices and the element-node correspondence to set up the stiffness matrix for solving with PETSc. Currently this is working fine for the seqaij matrix type and the associated solvers.

Now I want to implement the PCNN preconditioner. Does anybody have any simple examples with this preconditioner?

Thank you in advance

ando
PCNN preconditioner
Ok, this would be easy. I didn't try it because I thought I would have to mess around with more ISLocalToGlobal stuff. I'll try it out and give feedback.

cheers

ando

Barry Smith wrote:
> I'm sorry there is no example of the use of PCNN. Here is what you need to do.
> Use MatCreateIS() to create the matrix. Use MatSetValuesLocal() to put the
> values in the matrix then use PCSetType(pc,PCNN); to set the preconditioner
> to NN.
>
> Barry
>
> Note: You cannot just use MatCreateMPIAIJ() to create the matrix because
> it needs the unassembled per processor parts to build the preconditioner.
> You cannot use MatSetValues() because then it could not have the
> unassembled matrix.
>
> On Feb 26, 2009, at 10:12 AM, Andreas Grassl wrote:
>> Hello,
>>
>> I'm working with FE software from which I get the element stiffness
>> matrices and the element-node correspondence to set up the stiffness
>> matrix for solving with PETSc. Currently this is working fine for the
>> seqaij matrix type and the associated solvers.
>>
>> Now I want to implement the PCNN preconditioner. Does anybody have any
>> simple examples with this preconditioner?
>>
>> Thank you in advance
>>
>> ando
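To illustrate what "unassembled per processor parts" means in Barry's note, here is a small sketch in plain Python. It does not call PETSc; `add_element`, `assemble`, the two subdomains and their local-to-global lists are hypothetical names and data chosen for a 1D bar with four unit elements and five nodes. Each subdomain keeps its own local matrix indexed by local node numbers, and the global matrix is only the sum of those pieces through the local-to-global mapping.

```python
def add_element(local, local_nodes, element_matrix):
    """Add one element matrix into a subdomain matrix using LOCAL indices."""
    for i, r in enumerate(local_nodes):
        for j, c in enumerate(local_nodes):
            local[r][c] += element_matrix[i][j]


def assemble(n_global, subdomains):
    """Sum the unassembled subdomain matrices into one global matrix."""
    g = [[0.0] * n_global for _ in range(n_global)]
    for local, l2g in subdomains:
        for i, gi in enumerate(l2g):
            for j, gj in enumerate(l2g):
                g[gi][gj] += local[i][j]
    return g


K_E = [[1.0, -1.0], [-1.0, 1.0]]  # unit spring element (assumed data)


def build_subdomain():
    """Three local nodes, two elements: (0,1) and (1,2) in local numbering."""
    local = [[0.0] * 3 for _ in range(3)]
    add_element(local, [0, 1], K_E)
    add_element(local, [1, 2], K_E)
    return local


# Subdomain 0 holds global nodes 0,1,2; subdomain 1 holds global nodes 2,3,4.
sub0 = (build_subdomain(), [0, 1, 2])
sub1 = (build_subdomain(), [2, 3, 4])
K_global = assemble(5, [sub0, sub1])

if __name__ == "__main__":
    print("local diagonal at interface node:", sub0[0][2][2], sub1[0][0][0])
    print("global diagonal at interface node:", K_global[2][2])
```

The interface node (global node 2) has diagonal entry 1.0 in each local matrix but 2.0 in the assembled one. Once the contributions are summed, the per-subdomain pieces can no longer be recovered, which is consistent with the advice to insert values with local indices rather than into an already assembled matrix.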
config-options
Hi,

I played around with some configure options and adapted the Python scripts from $PETSC_DIR/config/examples, but I didn't find an option to save the current default options to such a script file.

Is there a way to get all the configure options?

cheers

ando
config-options
Satish Balay wrote:
> On Mon, 16 Feb 2009, Andreas Grassl wrote:
>> Hi,
>>
>> I played around with some configure options and adapted the python
>> scripts from $PETSC_DIR/config/examples, but I didn't find an option to
>> save the current standard options to such a script-file.
>
> Not sure I understand the question. When you run configure with any set
> of options - and the run is successful - you get a
> $PETSC_ARCH/conf/reconfigure-$PETSC_ARCH.py file - with all these options
> listed in it.

Ok, I start from a script like this and wanted the default options that were not set explicitly to appear in a script as well.

>> Is there a possibility to get all the configure options?
>
> Hmm - you can run configure with '--help' option - and it will list all
> the options configure accepts.

Do I have to parse them out manually?

cheers

ando
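As a rough illustration of what can be recovered without parsing the '--help' output: the explicitly set options are plain `name=value` tokens, for example in the "Configure options" line of the error log in the PCNN thread above. The sketch below splits such a line into a dictionary. `parse_configure_options` is a hypothetical helper, not part of PETSc, and it only sees options that were given explicitly; defaults that were never set do not appear in that line.

```python
import shlex


def parse_configure_options(line):
    """Split a configure option line into {name: value}.

    Options given without '=value' are stored with the value None; how
    configure itself interprets a bare flag is not modeled here.
    """
    options = {}
    for token in shlex.split(line):
        name, sep, value = token.partition("=")
        options[name.lstrip("-")] = value if sep else None
    return options


# Option line copied from the error log quoted earlier in this archive.
LOGGED_OPTIONS = (
    "--with-64-bit-indices --with-scalar-type=real --with-precision=double "
    "--with-cc=icc --with-fc=ifort --with-cxx=icpc --with-shared=0 "
    "--with-mpi=1 --download-mpich=ifneeded --with-scalapack=1 "
    "--download-scalapack=ifneeded --download-f-blas-lapack=yes "
    "--with-blacs=1 --download-blacs=yes PETSC_ARCH=linux64-gnu-c-debug"
)

if __name__ == "__main__":
    for name, value in sorted(parse_configure_options(LOGGED_OPTIONS).items()):
        print(name, "=", value)
```

A dictionary like this makes it easy to compare two builds, but it cannot list the options that were left at their defaults.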
config-options
Barry Smith wrote:
> The thing with PETSc config/configure.py options (and even more so with
> the PETSc runtime options) is that they are NOT centralized and they
> depend on each other. We don't even know how many there are and all their
> possible combinations. The only access to ALL of them is via the -help
> option where the current possibilities are listed. When someone adds a
> new component to PETSc (or to its config) its options go with that
> component (not in some central file).

I understand.

> I guess we could add some runtime option like -defaults that would
> present all of them in a machine parsable way, it might be kind of messy.
> How important is this?

I guess that as soon as I get more familiar with the framework I won't need it anymore, so it is not so important. And if troubles appear, I know where to ask ;-)

See you soon

cheers

ando