Richard,

I managed to build the Simul@trophy code. Could you tell me how to run your test? I want to see if I can reproduce the error.

Thanks
--Junchao Zhang

On Fri, Feb 14, 2020 at 8:34 PM Richard Beare <[email protected]> wrote:

> It doesn't compile out of the box with master.
>
> singularity def file attached.
>
> On Sat, 15 Feb 2020 at 08:03, Richard Beare <[email protected]>
> wrote:
>
>> I will see if I can build with master. The docs for simulatrophy say
>> 3.6.3.1.
>>
>> On Sat, 15 Feb 2020 at 02:47, Junchao Zhang <[email protected]> wrote:
>>
>>> Which petsc version do you use? In aij.c of the master branch, I saw
>>> Barry recently added a useful check to catch number of nonzero overflow,
>>> ierr = PetscIntCast(nz64,&nz);CHKERRQ(ierr); But you mentioned using
>>> 64-bit indices did not solve the problem, it might not be the reason. You
>>> should try the master branch if feasible. Also, vary number of MPI ranks to
>>> see if error stack changes.
>>>
>>> --Junchao Zhang
>>>
>>>
>>> On Fri, Feb 14, 2020 at 5:12 AM Richard Beare via petsc-users
>>> <[email protected]> wrote:
>>>
>>>> No luck - exactly the same error after including the
>>>> --with-64-bit-indicies=yes --download-mpich=yes options
>>>>
>>>> ==8674== Argument 'size' of function memalign has a fishy (possibly negative) value: -17152036540
>>>> ==8674==    at 0x4C320A6: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>> ==8674==    by 0x4F0CFF2: PetscMallocAlign(unsigned long, int, char const*, char const*, void**) (mal.c:28)
>>>> ==8674==    by 0x4F0F716: PetscTrMallocDefault(unsigned long, int, char const*, char const*, void**) (mtr.c:188)
>>>> ==8674==    by 0x569AF3E: MatSeqAIJSetPreallocation_SeqAIJ (aij.c:3595)
>>>> ==8674==    by 0x569A531: MatSeqAIJSetPreallocation (aij.c:3539)
>>>> ==8674==    by 0x599080A: DMCreateMatrix_DA_3d_MPIAIJ(_p_DM*, _p_Mat*) (fdda.c:1085)
>>>> ==8674==    by 0x598B937: DMCreateMatrix_DA(_p_DM*, _p_Mat**) (fdda.c:759)
>>>> ==8674==    by 0x58A2BF2: DMCreateMatrix (dm.c:956)
>>>> ==8674==    by 0x5E377B3: KSPSetUp (itfunc.c:262)
>>>> ==8674==    by 0x409FFC: PetscAdLemTaras3D::solveModel(bool) (PetscAdLemTaras3D.hxx:255)
>>>> ==8674==    by 0x4239FB: AdLem3D<3u>::solveModel(bool, bool, bool) (AdLem3D.hxx:551)
>>>> ==8674==    by 0x41BD17: main (PetscAdLemMain.cxx:344)
>>>> ==8674==
>>>> On Fri, 14 Feb 2020 at 17:07, Smith, Barry F. <[email protected]>
>>>> wrote:
>>>>
>>>>>
>>>>> Richard,
>>>>>
>>>>> It is likely that for these problems some of the integers become
>>>>> too large for the int variable to hold them, thus they overflow and
>>>>> become negative.
>>>>>
>>>>> You should make a new PETSC_ARCH configuration of PETSc that uses
>>>>> the configure option --with-64-bit-indices, this will change PETSc to
>>>>> use 64 bit integers which will not overflow.
>>>>>
>>>>> Good luck and let us know how it works out
>>>>>
>>>>> Barry
>>>>>
>>>>> Probably the code is built with an older version of PETSc; the
>>>>> later versions should produce a more useful error message.
>>>>>
>>>>> > On Feb 13, 2020, at 11:43 PM, Richard Beare via petsc-users
>>>>> > <[email protected]> wrote:
>>>>> >
>>>>> > Hi Everyone,
>>>>> > I am experimenting with the Simlul@trophy tool
>>>>> > (https://github.com/Inria-Asclepios/simul-atrophy) that uses petsc to
>>>>> > simulate brain atrophy based on segmented MRI data. I am not the
>>>>> > author. I have this running on most of a dataset of about 50 scans,
>>>>> > but experience crashes with several that I am trying to track down.
>>>>> > However I am out of ideas. The problem images are slightly bigger
>>>>> > than some of the successful ones, but not substantially so, and I
>>>>> > have experimented on machines with sufficient RAM. The error happens
>>>>> > very quickly, as part of setup - see the valgrind report below. I
>>>>> > haven't managed to get the sgcheck tool to work yet. I can only guess
>>>>> > that the ksp object is somehow becoming corrupted during the setup
>>>>> > process, but the array sizes that I can track (which derive from
>>>>> > image sizes), appear correct at every point I can check. Any
>>>>> > suggestions as to how I can check what might go wrong in the setup of
>>>>> > the ksp object?
>>>>> > Thankyou.
>>>>> >
>>>>> > valgrind tells me:
>>>>> >
>>>>> > ==18175== Argument 'size' of function memalign has a fishy (possibly negative) value: -17152038144
>>>>> > ==18175==    at 0x4C320A6: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>> > ==18175==    by 0x4F0F1F2: PetscMallocAlign(unsigned long, int, char const*, char const*, void**) (mal.c:28)
>>>>> > ==18175==    by 0x56B43CA: MatSeqAIJSetPreallocation_SeqAIJ (aij.c:3595)
>>>>> > ==18175==    by 0x56B39BD: MatSeqAIJSetPreallocation (aij.c:3539)
>>>>> > ==18175==    by 0x59A9B44: DMCreateMatrix_DA_3d_MPIAIJ(_p_DM*, _p_Mat*) (fdda.c:1085)
>>>>> > ==18175==    by 0x59A4C71: DMCreateMatrix_DA(_p_DM*, _p_Mat**) (fdda.c:759)
>>>>> > ==18175==    by 0x58BBD29: DMCreateMatrix (dm.c:956)
>>>>> > ==18175==    by 0x5E509D5: KSPSetUp (itfunc.c:262)
>>>>> > ==18175==    by 0x40A3DE: PetscAdLemTaras3D::solveModel(bool) (PetscAdLemTaras3D.hxx:269)
>>>>> > ==18175==    by 0x42413F: AdLem3D<3u>::solveModel(bool, bool, bool) (AdLem3D.hxx:552)
>>>>> > ==18175==    by 0x41C25C: main (PetscAdLemMain.cxx:349)
>>>>> > ==18175==
>>>>> >
>>>>> > --
>>>>> > --
>>>>> > A/Prof Richard Beare
>>>>> > Imaging and Bioinformatics, Peninsula Clinical School
>>>>> > orcid.org/0000-0002-7530-5664
>>>>> > [email protected]
>>>>> > +61 3 9788 1724
>>>>> >
>>>>> > Geospatial Research:
>>>>> > https://www.monash.edu/medicine/scs/medicine/research/geospatial-analysis
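
As a rough illustration of the overflow Barry describes in the quoted thread, the sketch below shows how a nonzero count that exceeds the 32-bit range turns negative, and how a range check (similar in spirit to the PetscIntCast call Junchao mentions) reports the problem instead. It is plain Python, not PETSc code. The grid size, degrees of freedom per node, and stencil width are hypothetical values chosen only to push the count past 2^31 - 1; they are not taken from Simul@trophy or from the failing images, and the helper names are made up for this example.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Return value as a two's-complement signed 32-bit integer would hold it."""
    wrapped = value & 0xFFFFFFFF
    return wrapped - 2**32 if wrapped > INT32_MAX else wrapped


def checked_int32(value: int) -> int:
    """Raise instead of wrapping when value does not fit in 32 bits.

    Hypothetical helper: it only mimics the idea of a checked cast.
    """
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{value} does not fit in a 32-bit integer")
    return value


# Hypothetical problem size (assumption, not from the thread).
nx, ny, nz_grid = 400, 400, 400   # grid points per dimension
dof = 4                           # unknowns per grid point
stencil_points = 7                # 3D star stencil

rows = nx * ny * nz_grid * dof
nonzeros_per_row = stencil_points * dof
total_nonzeros = rows * nonzeros_per_row  # exact, Python ints do not overflow

print("exact nonzero count:  ", total_nonzeros)
print("as a 32-bit integer:  ", wrap_int32(total_nonzeros))

try:
    checked_int32(total_nonzeros)
except OverflowError as err:
    print("checked cast reports: ", err)
```

A negative count passed on to an allocator would then show up as a negative or very large size, which is at least consistent with the memalign warning in the valgrind output. Whether this is what actually happens in the failing runs is not established in the thread.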
