On Wed, Apr 16, 2014 at 2:43 PM, Barry Smith <[email protected]> wrote:
>
> On Apr 16, 2014, at 1:38 PM, Mark Adams <[email protected]> wrote:
>
> > Ed, I fixed some integer -- PetscInt error and put #ifdefs for the new interface to KSPSetOperators. It works for me on Edison. This is with Intel compilers.
> >
> > You seem to be dying in MatCreate... I'm guessing you are using 64 bit indices. I fixed this in the code. Give this a try and see if it works.
>
> I’m confused. This code works with ONLY one particular branch of PETSc; it will not work with any other branch or version!!!!! Why are you putting ifdefs for different versions in the code?
>
Good point. Ed: you could just put #error in place of the old call.
>
> Barry
>
> >
> > On Wed, Apr 16, 2014 at 2:33 PM, Barry Smith <[email protected]> wrote:
> >
> > Mark,
> >
> > Please send configure.log and make.log and run with 4 threads and send all output.
> >
> > Now Ed and I have had no problem running this code. But there are some issues with running code with each thread creating their own objects. That is, I have another example in C that does not work. There are places where we work with MPI attributes and they are not properly protected with locks. This may or may not be affecting you. If you have to develop code that has different threads create different objects you are welcome to work with Jed and me etc in getting the thread stuff working in PETSc but this branch is not the starting point. So basically Ed got lucky and we won’t have “real” support for this usage of threads for a while (months at least).
> >
> > You absolutely should configure with --with-debugging --with-log=0
> >
> > Barry
> >
> > On Apr 16, 2014, at 10:26 AM, Mark Adams <[email protected]> wrote:
> >
> > > I could also use your compile line. I am getting no output.
> > >
> > > On Wed, Apr 16, 2014 at 10:41 AM, Ed D'Azevedo <[email protected]> wrote:
> > > Hi Mark,
> > >
> > > I hope this back trace might be helpful.
> > >
> > > You may need to build petsc with
> > > ./configure \
> > > --with-x=0 \
> > > --with-debugging=0 \
> > > --with-log=0 \
> > >
> > > env003> addr2line --exe=tpetsc_madams 0x4b76e3 0x4aa3d7 0x484888 0x475798 0x548aa5 0x5e5df2 0x57a353 0x4a701e 0x44e9be 0x44e1f4
> > > /autofs/na3_home1/adams/petsc/src/sys/memory/mal.c:27
> > > /autofs/na3_home1/adams/petsc/src/sys/utils/str.c:188
> > > /autofs/na3_home1/adams/petsc/src/sys/logging/utils/eventlog.c:317
> > > /autofs/na3_home1/adams/petsc/src/sys/logging/plog.c:747
> > > /autofs/na3_home1/adams/petsc/src/mat/interface/dlregismat.c:145
> > > /autofs/na3_home1/adams/petsc/src/mat/utils/gcreate.c:57
> > > /autofs/na3_home1/adams/petsc/src/mat/impls/aij/seq/aij.c:3576
> > > /autofs/na3_home1/adams/petsc/src/mat/impls/aij/seq/ftn-custom/zaijf.c:14
> > > /autofs/na3_home1/efdazedo/test/PETSC/./tpetsc.F90:147
> > >
> > > ======= Backtrace: =========
> > > /lib64/libc.so.6(+0x75558)[0x2aaaba5ff558]
> > > /lib64/libc.so.6(cfree+0x6c)[0x2aaaba6044fc]
> > > ./tpetsc_madams[0x4b7759]
> > > /lib64/libpthread.so.0(+0xf7c0) [0x2aaaaacdb7c0]
> > > /lib64/libc.so.6(gsignal+0x35) [0x2aaaba5bcb55]
> > > /lib64/libc.so.6(abort+0x181) [0x2aaaba5be131]
> > > /lib64/libc.so.6(+0x7576d) [0x2aaaba5ff76d]
> > > /lib64/libc.so.6(+0x789f0) [0x2aaaba6029f0]
> > > /lib64/libc.so.6(__libc_malloc+0x77) [0x2aaaba6045e7]
> > > ./tpetsc_madams() [0x4b76e3]
> > > ./tpetsc_madams() [0x4aa3d7]
> > > ./tpetsc_madams() [0x484888]
> > > ./tpetsc_madams() [0x475798]
> > > ./tpetsc_madams() [0x548aa5]
> > > ./tpetsc_madams() [0x5e5df2]
> > > ./tpetsc_madams() [0x57a353]
> > > ./tpetsc_madams() [0x4a701e]
> > > ./tpetsc_madams() [0x44e9be]
> > > ./tpetsc_madams() [0x44e1f4]
> > > /lib64/libc.so.6(__libc_start_main+0xe6) [0x2aaaba5a8c36]
> > > ./tpetsc_madams() [0x44e0e9]
> > >
> > > On 04/16/2014 10:33 AM, Mark Adams wrote:
> > >> cc'ing petsc-dev.
> > >>
> > >> I will try it.
> > >>
> > >> On Wed, Apr 16, 2014 at 10:21 AM, Ed D'Azevedo <[email protected]> wrote:
> > >>
> > >> Hi Mark,
> > >>
> > >> I got an error when I tried the simple test code (see attached) on Titan.
> > >>
> > >> Can you try to run the attached test case to see if it will work for you?
> > >>
> > >> I have also sent the simple test code to Barry.
> > >>
> > >> The code seems to work with 1 thread
> > >>
> > >> env003> export OMP_NUM_THREADS=1
> > >> env003> aprun -n 1 -d 16 ./tpetsc_madams
> > >> PETSC_VERSION_RELEASE 0
> > >> PETSC_VERSION_MAJOR 3
> > >> PETSC_VERSION_MINOR 4
> > >> PETSC_VERSION_SUBMINOR 4
> > >> PETSC_VERSION_PATCH 0
> > >> PETSC_VERSION_DATE unknown
> > >> petsc_version_lt(3,3,0) is false
> > >> nthreads = 1 NCASES = 100
> > >> nz = 88804
> > >> Warning: ieee_inexact is signaling
> > >> all done
> > >> total time is 7.268438
> > >> maxval(err) 6.9650285539069046E-011
> > >> Application 4895013 resources: utime ~8s, stime ~0s, Rss ~22300, inblocks ~11428, outblocks ~35848
> > >>
> > >> The code seems to have trouble using more threads
> > >>
> > >> env003> export OMP_NUM_THREADS=16
> > >> env003> aprun -n 1 -d 16 ./tpetsc_madams
> > >> *** glibc detected *** ./tpetsc_madams: double free or corruption (!prev): 0x00000000015ee0b0 ***
> > >> PETSC_VERSION_RELEASE 0
> > >> PETSC_VERSION_MAJOR 3
> > >> PETSC_VERSION_MINOR 4
> > >> PETSC_VERSION_SUBMINOR 4
> > >> PETSC_VERSION_PATCH 0
> > >> PETSC_VERSION_DATE unknown
> > >> petsc_version_lt(3,3,0) is false
> > >> nthreads = 16 NCASES = 100
> > >> tpetsc_madams: malloc.c:3091: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
> > >> Error: abort
> > >> rax 0000000000000000, rbx 0000000000000fff, rcx ffffffffffffffff
> > >> rdx 0000000000000006, rsp 00002aaac4554f78, rbp 00002aaad0000098
> > >> rsi 0000000000005200, rdi 00000000000051f1, r8 00000000ffffffff
> > >> r9 00002aaaba8f9e40, r10 0000000000000008, r11 0000000000000202
> > >> r12 0000000000000000, r13 00002aaad0008e30, r14 0000000000000020
> > >> r15 0000000000000000
> > >> ======= Backtrace: =========
> > >> /lib64/libc.so.6(+0x75558)[0x2aaaba5ff558]
> > >> /lib64/libc.so.6(cfree+0x6c)[0x2aaaba6044fc]
> > >> ./tpetsc_madams[0x4b7759]
> > >> /lib64/libpthread.so.0(+0xf7c0) [0x2aaaaacdb7c0]
> > >> /lib64/libc.so.6(gsignal+0x35) [0x2aaaba5bcb55]
> > >> /lib64/libc.so.6(abort+0x181) [0x2aaaba5be131]
> > >> /lib64/libc.so.6(+0x7576d) [0x2aaaba5ff76d]
> > >> /lib64/libc.so.6(+0x789f0) [0x2aaaba6029f0]
> > >> /lib64/libc.so.6(__libc_malloc+0x77) [0x2aaaba6045e7]
> > >> ./tpetsc_madams() [0x4b76e3]
> > >> ./tpetsc_madams() [0x4aa3d7]
> > >> ./tpetsc_madams() [0x484888]
> > >> ./tpetsc_madams() [0x475798]
> > >> ./tpetsc_madams() [0x548aa5]
> > >> ./tpetsc_madams() [0x5e5df2]
> > >> ./tpetsc_madams() [0x57a353]
> > >> ./tpetsc_madams() [0x4a701e]
> > >> ./tpetsc_madams() [0x44e9be]
> > >> ./tpetsc_madams() [0x44e1f4]
> > >> /lib64/libc.so.6(__libc_start_main+0xe6) [0x2aaaba5a8c36]
> > >> ./tpetsc_madams() [0x44e0e9]
> > >> Application 4895021 exit codes: 127
> > >> Application 4895021 resources: utime ~0s, stime ~0s, Rss ~12544, inblocks ~11429, outblocks ~35849
> > >>
> > >> On 04/11/2014 04:34 PM, Mark Adams wrote:
> > >>> PETSc dev 'master' now has Barry's thread safe stuff so you should be able to use that. I have built it in:
> > >>>
> > >>> PETSC_DIR=/autofs/na3_home1/adams/petsc
> > >>> PETSC_ARCH=arch-titan-opt
> > >>>
> > >>> So try this version out.
> > >>> And revert the code to the repo version by doing:
> > >>>
> > >>> > git checkout poisson.F90
> > >>>
> > >>> and any other place where #if PETSC_VERSION_GE(3,5,0) is used. I only see poisson.F90.
> > >>>
> > >>> If this works I can install it wherever you like as Ed did.
> > >>>
> > >>> Mark
> > >>>
> > >>> On Thu, Apr 10, 2014 at 11:24 PM, Seung-Hoe Ku <[email protected]> wrote:
> > >>> Hi Mark,
> > >>>
> > >>> I think it is not a big problem until people other than you use petsc 3.5. For now I have disabled the #if/#endif for other users. Sorry for the inconvenience. Could you uncomment it when you are working with 3.5, and not commit that to core_dev, please?
> > >>> We need Ed's thread safe petsc for performance reasons, so we can find some way to resolve it when 3.5 is released and installed on titan.
> > >>>
> > >>> Thanks,
> > >>> Seung-Hoe
> > >>>
> > >>> On Thu, Apr 10, 2014 at 11:38 AM, Mark Adams <[email protected]> wrote:
> > >>> We've figured out the problem (again). It will be fixed in future versions.
> > >>>
> > >>> We can fix your installation. I'm guessing Ed did this installation so it might not be worth fixing this code since you are up and running. Let me know what you want to do.
> > >>>
> > >>> The problem is that this is a development version of the code, which is not as stable as the releases. This version needs to be updated to fix this problem. I have to tell you how to do this update so let me know if you want to do it.
> > >>>
> > >>> Mark
> > >>>
> > >>> On Thu, Apr 10, 2014 at 8:58 AM, Seung-Hoe Ku <[email protected]> wrote:
> > >>> One question:
> > >>> If PETSC_VERSION_GE is defined in terms of PETSC_VERSION_GT, and PETSC_VERSION_GT is defined in terms of something in petsc.h,
> > >>> will a redefinition of PETSC_VERSION_GT after petsc.h change PETSC_VERSION_GE?
> > >>>
> > >>> On Thu, Apr 10, 2014 at 10:55 AM, Seung-Hoe Ku <[email protected]> wrote:
> > >>> This is the code I tried to compile.
> > >>>
> > >>> #if PETSC_VERSION_GE(5,5,0)
> > >>> 214 BBBBBBBBBBBBBBBBBBB=0
> > >>> 215 call KSPSetOperators(solver%ksp, solver%Amat, solver%Amat, ierr )
> > >>> 216 #else
> > >>> 217 AAAAAAAAAAAA=0
> > >>> 218 call KSPSetOperators(solver%ksp, solver%Amat, solver%Amat, SAME_NONZERO_PATTERN, ierr )
> > >>> 219 #endif
> > >>>
> > >>> On Thu, Apr 10, 2014 at 10:49 AM, Mark Adams <[email protected]> wrote:
> > >>>
> > >>> On Thu, Apr 10, 2014 at 8:25 AM, Seung-Hoe Ku <[email protected]> wrote:
> > >>> I have the same problem. I used poisson.F90 in /lustre/atlas2/env003/scratch/shku/XGC1_3_petsc_problem/
> > >>>
> > >>> It seems that #undef should be after #include<finclude/petsc.h>. Otherwise, petsc.h seems to try to redefine it.
> > >>>
> > >>> Oh yes.
> > >>>
> > >>> Anyway, I got the same error message:
> > >>>
> > >>> PGF90-S-0038-Symbol, bbbbbbbbbbbbbbbbbbb, has not been explicitly declared (poisson.F90)
> > >>> 0 inform, 0 warnings, 1 severes, 0 fatal for init_1field_solver
> > >>>
> > >>> is this in the source file?
> > >>>
> > >>> I tried 5.5.0 for the arguments of PETSC_VERSION_GE, but still have the same problem.
> > >>>
> > >>> Or do you mean VERSION_GE and VERSION_LT instead of VERSION_GT and VERSION_LE?
> > >>>
> > >>> Thanks,
> > >>> Seung-Hoe
> > >>>
> > >>> On Wed, Apr 9, 2014 at 9:54 PM, Mark Adams <[email protected]> wrote:
> > >>> We might have a fix. It turns out that some fortran compilers do not do the #define quite right. Try this:
> > >>>
> > >>> > git checkout poisson.F90
> > >>>
> > >>> This will put the original (bad) file back.
> > >>> Then add to poisson.F90
> > >>>
> > >>> #undef PETSC_VERSION_GT
> > >>> #define PETSC_VERSION_GT(MAJOR,MINOR,SUBMINOR) \
> > >>> (0==PETSC_VERSION_LE(MAJOR,MINOR,SUBMINOR))
> > >>>
> > >>> This is the fix that Jed thinks will work. If it works we can propagate it.
> > >>>
> > >>> Sorry about the confusion,
> > >>> Mark
> > >>>
> > >>> On Wed, Apr 9, 2014 at 7:36 PM, Seung-Hoe Ku <[email protected]> wrote:
> > >>> Great! Thank you.
> > >>>
> > >>> On Wed, Apr 9, 2014 at 9:35 PM, Mark Adams <[email protected]> wrote:
> > >>> Good news. I'm at a meeting and Jed Brown is here and he seems to have been able to reproduce the error. Preprocessors can be a pain. I'm going to wait until Jed has a chance to look at this and come up with a solution.
> > >>>
> > >>> On Wed, Apr 9, 2014 at 7:28 PM, Seung-Hoe Ku <[email protected]> wrote:
> > >>> Right now I am using the iterative solver, which will be replaced by your 2 field solver later.
> > >>> Sorry. The line number 3003 is wrong. It is 209 or near it.
> > >>> Yes. We can take a look at this when you come. It is not an urgent problem.
> > >>>
> > >>> Thanks,
> > >>> Seung-Hoe
> > >>>
> > >>> On Wed, Apr 9, 2014 at 9:25 PM, Mark Adams <[email protected]> wrote:
> > >>> I noticed that your makefile was not using the new petsc solver. Is that intentional?
> > >>>
> > >>> The line you gave me, 3003 of poisson.F90, seems to be the last line of the file.
> > >>>
> > >>> I am not getting Edison or Titan to build. We can take a look at this next week.
> > >>>
> > >>> Mark
> > >>>
> > >>> On Tue, Apr 8, 2014 at 12:38 PM, Seung-Hoe Ku <[email protected]> wrote:
> > >>> Hi Mark,
> > >>>
> > >>> It seems that PETSC_VERSION_GE is not working at line 3003 of poisson.F90. Is there any reason for using PETSC_VERSION_GE instead of PETSC_VERSION_LE? The previous code worked, I think.
> > >>>
> > >>> Thanks,
> > >>> Seung-Hoe
> > >>>
> > >>
> > >
> > > <tpetsc.F90>
> >
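On Seung-Hoe's question in the thread about whether redefining PETSC_VERSION_GT after the include also changes PETSC_VERSION_GE: in the C preprocessor a macro body is expanded where the macro is used, not where it is defined, so a later #undef/#define of an inner macro is picked up by every outer macro used after that point. The sketch below illustrates this in plain C. All DEMO_* macros and the function names are hypothetical stand-ins written for this illustration; they are not the definitions in petsc.h, and the deliberately wrong first DEMO_VERSION_GT only mimics a preprocessor that mis-evaluates the comparison. Preprocessors used for .F90 files may differ in details, so treat this as a sketch of the principle, not a description of any particular compiler.

```c
/* Hypothetical version header, pinned at 3.4.4 like the output in the thread. */
#define DEMO_MAJOR    3
#define DEMO_MINOR    4
#define DEMO_SUBMINOR 4

#define DEMO_VERSION_EQ(M, m, s) \
  (DEMO_MAJOR == (M) && DEMO_MINOR == (m) && DEMO_SUBMINOR == (s))
#define DEMO_VERSION_LT(M, m, s) \
  (DEMO_MAJOR < (M) || (DEMO_MAJOR == (M) && \
   (DEMO_MINOR < (m) || (DEMO_MINOR == (m) && DEMO_SUBMINOR < (s)))))
#define DEMO_VERSION_LE(M, m, s) \
  (DEMO_VERSION_LT(M, m, s) || DEMO_VERSION_EQ(M, m, s))

/* Deliberately wrong GT (always true), standing in for a mis-evaluated macro. */
#define DEMO_VERSION_GT(M, m, s) 1

/* GE is written once, in terms of GT; its body is expanded at each use. */
#define DEMO_VERSION_GE(M, m, s) \
  (DEMO_VERSION_GT(M, m, s) || DEMO_VERSION_EQ(M, m, s))

/* Used while the wrong GT is in effect: 3.4.4 >= 5.5.0 comes out true. */
int ge_5_5_0_before(void) { return DEMO_VERSION_GE(5, 5, 0); }

/* Same shape as the workaround in the thread: redefine GT via LE. */
#undef DEMO_VERSION_GT
#define DEMO_VERSION_GT(M, m, s) (0 == DEMO_VERSION_LE(M, m, s))

/* GE itself was not touched, but now expands to the corrected GT. */
int ge_5_5_0_after(void) { return DEMO_VERSION_GE(5, 5, 0); }

/* The same macro also drives #if, selecting one of two call variants. */
int uses_new_interface(void) {
#if DEMO_VERSION_GE(3, 5, 0)
  return 1; /* would take the newer call signature */
#else
  return 0; /* would take the older call signature */
#endif
}
```

With the version pinned at 3.4.4, ge_5_5_0_before() returns 1 (the wrong branch, matching the compile error on the BBBB... line), while ge_5_5_0_after() returns 0 and uses_new_interface() returns 0. This is also why the #undef has to come after the include: placed before it, the header's own definition would simply replace the workaround.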
