On Apr 16, 2014, at 1:38 PM, Mark Adams <[email protected]> wrote:
> Ed, I fixed some integer -- PetscInt error and put #ifdefs for the new
> interface to KSPSetOperators. It works for me on Edison. This is with Intel
> compilers.
> You seem to be dying in MatCreate... I'm guessing you are using 64 bit
> indices. I fixed this in the code. Give this a try and see if it works.
I’m confused. This code works with ONLY one particular branch of PETSc; it will not work with any other branch or version!!!!! Why are you putting ifdefs for different versions in the code?

   Barry

>
> On Wed, Apr 16, 2014 at 2:33 PM, Barry Smith <[email protected]> wrote:
>
> Mark,
>
> Please send configure.log and make.log and run with 4 threads and send
> all output.
>
> Now Ed and I have had no problem running this code. But there are some
> issues with running code with each thread creating their own objects. That
> is, I have another example in C that does not work. There are places where
> we work with MPI attributes and they are not properly protected with locks.
> This may or may not be affecting you. If you have to develop code that has
> different threads create different objects you are welcome to work with Jed
> and me etc in getting the thread stuff working in PETSc but this branch is not
> the starting point. So basically Ed got lucky and we won’t have “real”
> support for this usage of threads for a while (months at least).
>
> You absolutely should configure with --with-debugging --with-log=0
>
> Barry
>
> On Apr 16, 2014, at 10:26 AM, Mark Adams <[email protected]> wrote:
>
> > I could also use your compile line. I am getting no output.
> >
> > On Wed, Apr 16, 2014 at 10:41 AM, Ed D'Azevedo <[email protected]> wrote:
> > Hi Mark,
> >
> > I hope this back trace might be helpful.
> >
> > You may need to build petsc with
> > ./configure \
> > --with-x=0 \
> > --with-debugging=0 \
> > --with-log=0 \
> >
> > env003> addr2line --exe=tpetsc_madams 0x4b76e3 0x4aa3d7 0x484888 0x475798
> > 0x548aa5 0x5e5df2 0x57a353 0x4a701e 0x44e9be 0x44e1f4
> > /autofs/na3_home1/adams/petsc/src/sys/memory/mal.c:27
> > /autofs/na3_home1/adams/petsc/src/sys/utils/str.c:188
> > /autofs/na3_home1/adams/petsc/src/sys/logging/utils/eventlog.c:317
> > /autofs/na3_home1/adams/petsc/src/sys/logging/plog.c:747
> > /autofs/na3_home1/adams/petsc/src/mat/interface/dlregismat.c:145
> > /autofs/na3_home1/adams/petsc/src/mat/utils/gcreate.c:57
> > /autofs/na3_home1/adams/petsc/src/mat/impls/aij/seq/aij.c:3576
> > /autofs/na3_home1/adams/petsc/src/mat/impls/aij/seq/ftn-custom/zaijf.c:14
> > /autofs/na3_home1/efdazedo/test/PETSC/./tpetsc.F90:147
> >
> > ======= Backtrace: =========
> > /lib64/libc.so.6(+0x75558)[0x2aaaba5ff558]
> > /lib64/libc.so.6(cfree+0x6c)[0x2aaaba6044fc]
> > ./tpetsc_madams[0x4b7759]
> > /lib64/libpthread.so.0(+0xf7c0) [0x2aaaaacdb7c0]
> > /lib64/libc.so.6(gsignal+0x35) [0x2aaaba5bcb55]
> > /lib64/libc.so.6(abort+0x181) [0x2aaaba5be131]
> > /lib64/libc.so.6(+0x7576d) [0x2aaaba5ff76d]
> > /lib64/libc.so.6(+0x789f0) [0x2aaaba6029f0]
> > /lib64/libc.so.6(__libc_malloc+0x77) [0x2aaaba6045e7]
> > ./tpetsc_madams() [0x4b76e3]
> > ./tpetsc_madams() [0x4aa3d7]
> > ./tpetsc_madams() [0x484888]
> > ./tpetsc_madams() [0x475798]
> > ./tpetsc_madams() [0x548aa5]
> > ./tpetsc_madams() [0x5e5df2]
> > ./tpetsc_madams() [0x57a353]
> > ./tpetsc_madams() [0x4a701e]
> > ./tpetsc_madams() [0x44e9be]
> > ./tpetsc_madams() [0x44e1f4]
> > /lib64/libc.so.6(__libc_start_main+0xe6) [0x2aaaba5a8c36]
> > ./tpetsc_madams() [0x44e0e9]
> >
> > On 04/16/2014 10:33 AM, Mark Adams wrote:
> >> cc'ing petsc-dev.
> >>
> >> I will try it.
> >>
> >> On Wed, Apr 16, 2014 at 10:21 AM, Ed D'Azevedo <[email protected]> wrote:
> >>
> >> Hi Mark,
> >>
> >> I got an error when I tried the simple test code (see attached) on Titan.
> >>
> >> Can you try to run the attached test case to see if it will work for you?
> >>
> >> I have also sent the simple test code to Barry.
> >>
> >> The code seems to work with 1 thread
> >>
> >> env003> export OMP_NUM_THREADS=1
> >> env003> aprun -n 1 -d 16 ./tpetsc_madams
> >> PETSC_VERSION_RELEASE 0
> >> PETSC_VERSION_MAJOR 3
> >> PETSC_VERSION_MINOR 4
> >> PETSC_VERSION_SUBMINOR 4
> >> PETSC_VERSION_PATCH 0
> >> PETSC_VERSION_DATE unknown
> >> petsc_version_lt(3,3,0) is false
> >> nthreads = 1 NCASES = 100
> >> nz = 88804
> >> Warning: ieee_inexact is signaling
> >> all done
> >> total time is 7.268438
> >> maxval(err) 6.9650285539069046E-011
> >> Application 4895013 resources: utime ~8s, stime ~0s, Rss ~22300, inblocks
> >> ~11428, outblocks ~35848
> >>
> >> The code seems to have trouble using more threads
> >>
> >> env003> export OMP_NUM_THREADS=16
> >> env003> aprun -n 1 -d 16 ./tpetsc_madams
> >> *** glibc detected *** ./tpetsc_madams: double free or corruption (!prev):
> >> 0x00000000015ee0b0 ***
> >> PETSC_VERSION_RELEASE 0
> >> PETSC_VERSION_MAJOR 3
> >> PETSC_VERSION_MINOR 4
> >> PETSC_VERSION_SUBMINOR 4
> >> PETSC_VERSION_PATCH 0
> >> PETSC_VERSION_DATE unknown
> >> petsc_version_lt(3,3,0) is false
> >> nthreads = 16 NCASES = 100
> >> tpetsc_madams: malloc.c:3091: sYSMALLOc: Assertion `(old_top ==
> >> (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof
> >> (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long)
> >> (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk,
> >> fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) -
> >> 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) ==
> >> 0)' failed.
> >> Error: abort
> >> rax 0000000000000000, rbx 0000000000000fff, rcx ffffffffffffffff
> >> rdx 0000000000000006, rsp 00002aaac4554f78, rbp 00002aaad0000098
> >> rsi 0000000000005200, rdi 00000000000051f1, r8 00000000ffffffff
> >> r9 00002aaaba8f9e40, r10 0000000000000008, r11 0000000000000202
> >> r12 0000000000000000, r13 00002aaad0008e30, r14 0000000000000020
> >> r15 0000000000000000
> >> ======= Backtrace: =========
> >> /lib64/libc.so.6(+0x75558)[0x2aaaba5ff558]
> >> /lib64/libc.so.6(cfree+0x6c)[0x2aaaba6044fc]
> >> ./tpetsc_madams[0x4b7759]
> >> /lib64/libpthread.so.0(+0xf7c0) [0x2aaaaacdb7c0]
> >> /lib64/libc.so.6(gsignal+0x35) [0x2aaaba5bcb55]
> >> /lib64/libc.so.6(abort+0x181) [0x2aaaba5be131]
> >> /lib64/libc.so.6(+0x7576d) [0x2aaaba5ff76d]
> >> /lib64/libc.so.6(+0x789f0) [0x2aaaba6029f0]
> >> /lib64/libc.so.6(__libc_malloc+0x77) [0x2aaaba6045e7]
> >> ./tpetsc_madams() [0x4b76e3]
> >> ./tpetsc_madams() [0x4aa3d7]
> >> ./tpetsc_madams() [0x484888]
> >> ./tpetsc_madams() [0x475798]
> >> ./tpetsc_madams() [0x548aa5]
> >> ./tpetsc_madams() [0x5e5df2]
> >> ./tpetsc_madams() [0x57a353]
> >> ./tpetsc_madams() [0x4a701e]
> >> ./tpetsc_madams() [0x44e9be]
> >> ./tpetsc_madams() [0x44e1f4]
> >> /lib64/libc.so.6(__libc_start_main+0xe6) [0x2aaaba5a8c36]
> >> ./tpetsc_madams() [0x44e0e9]
> >> Application 4895021 exit codes: 127
> >> Application 4895021 resources: utime ~0s, stime ~0s, Rss ~12544, inblocks
> >> ~11429, outblocks ~35849
> >>
> >> On 04/11/2014 04:34 PM, Mark Adams wrote:
> >>> PETSc dev 'master' now has Barry's thread safe stuff so you should be
> >>> able to use that. I have built it in:
> >>>
> >>> PETSC_DIR=/autofs/na3_home1/adams/petsc
> >>> PETSC_ARCH=arch-titan-opt
> >>>
> >>> So try this version out. And revert the code to the repo version by
> >>> doing:
> >>>
> >>> > git checkout poisson.F90
> >>>
> >>> and any other place where #if PETSC_VERSION_GE(3,5,0) is used. I only
> >>> see poisson.F90.
> >>>
> >>> If this works I can install it wherever you like as Ed did.
> >>>
> >>> Mark
> >>>
> >>> On Thu, Apr 10, 2014 at 11:24 PM, Seung-Hoe Ku <[email protected]> wrote:
> >>> Hi Mark,
> >>>
> >>> I think it is not a big problem until someone other than you uses petsc 3.5.
> >>> Now I disabled #if #endif for other users. Sorry for the inconvenience.
> >>> Could you uncomment it when you are working with 3.5, and not commit it
> >>> to core_dev, please?
> >>> We need Ed's thread safe petsc for performance reasons, so we can find some
> >>> way to resolve it when 3.5 is released and installed on titan.
> >>>
> >>> Thanks,
> >>> Seung-Hoe
> >>>
> >>> On Thu, Apr 10, 2014 at 11:38 AM, Mark Adams <[email protected]> wrote:
> >>> We've figured out the problem (again). It will be fixed in future
> >>> versions.
> >>>
> >>> We can fix your installation. I'm guessing Ed did this installation so
> >>> it might not be worth fixing this code since you are up and running. Let
> >>> me know what you want to do.
> >>>
> >>> The problem is that this is a development version of the code, which is not as
> >>> stable as the releases. This version needs to be updated to fix this
> >>> problem. I have to tell you how to do this update so let me know if you
> >>> want to do it.
> >>>
> >>> Mark
> >>>
> >>> On Thu, Apr 10, 2014 at 8:58 AM, Seung-Hoe Ku <[email protected]> wrote:
> >>> One question is..
> >>> If PETSC_VERSION_GE is defined with PETSC_VERSION_GT and PETSC_VERSION_GT
> >>> is defined with something in petsc.h,
> >>> will redefinition of PETSC_VERSION_GT after petsc.h change
> >>> PETSC_VERSION_GE?
> >>>
> >>> On Thu, Apr 10, 2014 at 10:55 AM, Seung-Hoe Ku <[email protected]> wrote:
> >>> This is the code I tried to compile.
> >>>
> >>> #if PETSC_VERSION_GE(5,5,0)
> >>> 214 BBBBBBBBBBBBBBBBBBB=0
> >>> 215 call KSPSetOperators(solver%ksp, solver%Amat, solver%Amat, ierr )
> >>> 216 #else
> >>> 217 AAAAAAAAAAAA=0
> >>> 218 call KSPSetOperators(solver%ksp, solver%Amat, solver%Amat,
> >>> SAME_NONZERO_PATTERN, ierr )
> >>> 219 #endif
> >>>
> >>> On Thu, Apr 10, 2014 at 10:49 AM, Mark Adams <[email protected]> wrote:
> >>>
> >>> On Thu, Apr 10, 2014 at 8:25 AM, Seung-Hoe Ku <[email protected]> wrote:
> >>> I have the same problem. I used poisson.F90 in
> >>> /lustre/atlas2/env003/scratch/shku/XGC1_3_petsc_problem/
> >>>
> >>> It seems that #undef should be after #include<finclude/petsc.h>.
> >>> Otherwise, petsc.h seems to try to redefine it.
> >>>
> >>> Oh yes.
> >>>
> >>> Anyway, I got the same error message:
> >>>
> >>> PGF90-S-0038-Symbol, bbbbbbbbbbbbbbbbbbb, has not been explicitly
> >>> declared (poisson.F90)
> >>> 0 inform, 0 warnings, 1 severes, 0 fatal for init_1field_solver
> >>>
> >>> is this in the source file?
> >>>
> >>> I tried 5.5.0 for the arguments of PETSC_VERSION_GE, but still have the
> >>> same problem.
> >>>
> >>> Or do you mean VERSION_GE and VERSION_LT instead of VERSION_GT and
> >>> VERSION_LE?
> >>>
> >>> Thanks,
> >>> Seung-Hoe
> >>>
> >>> On Wed, Apr 9, 2014 at 9:54 PM, Mark Adams <[email protected]> wrote:
> >>> We might have a fix. It turns out that some fortran compilers do not do
> >>> the #define quite right. Try this:
> >>>
> >>> > git checkout poisson.F90
> >>>
> >>> This will put the original (bad) file back. Then add to poisson.F90
> >>>
> >>> #undef PETSC_VERSION_GT
> >>> #define PETSC_VERSION_GT(MAJOR,MINOR,SUBMINOR) \
> >>> (0==PETSC_VERSION_LE(MAJOR,MINOR,SUBMINOR))
> >>>
> >>> This is the fix that Jed thinks will work. If it works we can propagate
> >>> it.
> >>>
> >>> Sorry about the confusion,
> >>> Mark
> >>>
> >>> On Wed, Apr 9, 2014 at 7:36 PM, Seung-Hoe Ku <[email protected]> wrote:
> >>> Great! Thank you.
> >>>
> >>> On Wed, Apr 9, 2014 at 9:35 PM, Mark Adams <[email protected]> wrote:
> >>> Good news. I'm at a meeting and Jed Brown is here and he seems to have
> >>> been able to reproduce the error. Preprocessors can be a pain. I'm
> >>> going to wait until Jed has a chance to look at this and come up with a
> >>> solution.
> >>>
> >>> On Wed, Apr 9, 2014 at 7:28 PM, Seung-Hoe Ku <[email protected]> wrote:
> >>> Now I am using an iterative solver, which will be replaced by your 2 field
> >>> solver later.
> >>> Sorry. The line number 3003 is wrong. It is 209 or near it.
> >>> Yes. We can take a look at this when you come. It is not an urgent problem.
> >>>
> >>> Thanks,
> >>> Seung-Hoe
> >>>
> >>> On Wed, Apr 9, 2014 at 9:25 PM, Mark Adams <[email protected]> wrote:
> >>> I noticed that your makefile was not using the new petsc solver. Is that
> >>> intentional?
> >>>
> >>> The line you gave me, 3003 of poisson.F90, seems to be the last line of the
> >>> file.
> >>>
> >>> I am not getting Edison or Titan to build. We can take a look at this
> >>> next week.
> >>>
> >>> Mark
> >>>
> >>> On Tue, Apr 8, 2014 at 12:38 PM, Seung-Hoe Ku <[email protected]> wrote:
> >>> Hi Mark,
> >>>
> >>> It seems that PETSC_VERSION_GE is not working at line 3003 of poisson.F90.
> >>> Is there any reason for using PETSC_VERSION_GE instead of
> >>> PETSC_VERSION_LE? The previous code worked, I think.
> >>>
> >>> Thanks,
> >>> Seung-Hoe
> >>>
> >>
> >
>
<tpetsc.F90>
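
For anyone reading this thread later, here is a rough C sketch of the two preprocessor points it discusses: a version guard built from "less than" and "equal" comparisons (with GT and GE written in the "0 == ..." form quoted in the thread), and the earlier question of whether redefining an inner macro changes a macro defined in terms of it. All DEMO_* names and the demo_* functions are hypothetical and only stand in for the real macros; this does not claim to reproduce the definitions in the PETSc headers. The version numbers 3.4.4 are the ones printed in the test output quoted in the thread.

```c
/* Hypothetical version numbers, standing in for values a library header would supply. */
#define DEMO_VERSION_MAJOR    3
#define DEMO_VERSION_MINOR    4
#define DEMO_VERSION_SUBMINOR 4

/* Base comparisons (hypothetical, not the PETSc definitions). */
#define DEMO_VERSION_EQ(MAJOR, MINOR, SUBMINOR) \
  ((DEMO_VERSION_MAJOR == (MAJOR)) && (DEMO_VERSION_MINOR == (MINOR)) && \
   (DEMO_VERSION_SUBMINOR == (SUBMINOR)))

#define DEMO_VERSION_LT(MAJOR, MINOR, SUBMINOR) \
  ((DEMO_VERSION_MAJOR < (MAJOR)) || \
   ((DEMO_VERSION_MAJOR == (MAJOR)) && \
    ((DEMO_VERSION_MINOR < (MINOR)) || \
     ((DEMO_VERSION_MINOR == (MINOR)) && (DEMO_VERSION_SUBMINOR < (SUBMINOR))))))

#define DEMO_VERSION_LE(MAJOR, MINOR, SUBMINOR) \
  (DEMO_VERSION_LT(MAJOR, MINOR, SUBMINOR) || DEMO_VERSION_EQ(MAJOR, MINOR, SUBMINOR))

/* Derived comparisons, written as the negation of the opposite test. */
#define DEMO_VERSION_GT(MAJOR, MINOR, SUBMINOR) (0 == DEMO_VERSION_LE(MAJOR, MINOR, SUBMINOR))
#define DEMO_VERSION_GE(MAJOR, MINOR, SUBMINOR) (0 == DEMO_VERSION_LT(MAJOR, MINOR, SUBMINOR))

/* Version guard: report how many arguments the guarded call would take.
   In the Fortran code quoted in the thread, the newer call has 4 arguments
   and the older one has 5 (it carries an extra structure flag). */
int demo_set_operators_argc(void)
{
#if DEMO_VERSION_GE(3, 5, 0)
  return 4; /* newer interface */
#else
  return 5; /* older interface */
#endif
}

/* A macro defined in terms of another macro is expanded at the point of use,
   so redefining the inner macro changes what the outer one expands to. */
#define DEMO_INNER 1
#define DEMO_OUTER (DEMO_INNER + 10)

int demo_outer_before(void) { return DEMO_OUTER; } /* uses DEMO_INNER == 1 */

#undef DEMO_INNER
#define DEMO_INNER 2

int demo_outer_after(void) { return DEMO_OUTER; } /* uses DEMO_INNER == 2 */
```

With the 3.4.4 numbers above, the guard selects the older five-argument branch. The last two functions suggest that, in standard C preprocessing at least, a redefinition placed after the header does affect any macro that expands through the redefined one from that point on. Whether a given Fortran compiler's preprocessor behaves the same way is exactly what was in doubt in this thread.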
