On Wed, 3 Mar 2021, Jacob Faibussowitsch wrote:

> Hello All,
>
> I discovered a compiler bug in the IBM xl fortran compiler a few weeks ago
> that would crash the compiler when compiling petsc fortran interfaces. The
> TL;DR of it is that the xl compiler creates a function dictionary for every
> function imported in fortran modules, and since petsc fortran interfaces seem
> to import entire packages writ-large this exceeds the number of dictionary
> entries (2**21):
>
> > The reason for the Internal Compiler Error is because we can't grow an
> > internal dictionary anymore (ie we hit a 2**21 limit).
> > The file contains many module procedures and interfaces that use the same
> > helper module. As a result, we are importing the dictionary entries for
> > that module repeatedly, reaching the limit.
> >
> > Can you please give the following source code workaround a try?
> > Since there is already "use petscvecdefdummy" at the module scope, one
> > workaround might be to remove the unnecessary "use petscvecdefdummy" in
> > vecnotequal and vecequals and all similar procedures.
This sounds reasonable - but the change might be tedious [to make without breaking some required dependency]. Perhaps it will also help gfortran RAM requirements.

Satish

> >
> > For example, the test case has:
> > module petscvecdef
> >   use petscvecdefdummy
> >   ...
> >   function vecnotequal(A,B)
> >     use petscvecdefdummy
> >     logical vecnotequal
> >     type(tVec), intent(in) :: A,B
> >     vecnotequal = (A%v .ne. B%v)
> >   end function
> >   function vecequals(A,B)
> >     use petscvecdefdummy
> >     logical vecequals
> >     type(tVec), intent(in) :: A,B
> >     vecequals = (A%v .eq. B%v)
> >   end function
> >   ...
> > end module
> > Another workaround would be to put the procedure definitions from this
> > large module into several submodules. Each submodule would be able to
> > accommodate a dictionary with 2**21 entries.
> >
> > Please let us know if one of the above workarounds resolves the issue.
> >
> The proposed fix from IBM would be to pull “use moduleXXX” out of subroutines
> or to have our auto-fortran interfaces detect which symbols to include from
> the respective modules and only include those in the subroutines. I’m not
> familiar at all with how the interfaces are generated so I don’t even know if
> this is possible.
> > IBM provided the following additional explanation and example. Can the
> > process used to generate these routines and functions determine the
> > specific symbols required and then use the only keyword or import statement
> > to include them?
> >
> > When factoring use statements out of module procedures, you can just
> > delete them. But you can't completely remove them from interface blocks.
> > Instead, you can limit them either by using use <module>, only: <symbol> or
> > import <symbol>. If the hundreds of use statements in the program are
> > factored out / limited in this way, that should reduce the dictionary size
> > sufficiently for the program to compile.
> > > > For example > > Interface > > Subroutine VecRestoreArrayReadF90(v,array,ierr) > > use petscvecdef > > real(kind=selected_real_kind(10)), pointer :: array(:) > > integer(kind=selected_int_kind(5)) ierr > > type(tVec) v > > End Subroutine > > End Interface > > > > imports all symbols from petscvecdef into the dictionary even though we > > only need tVec . So we can either: > > > > Interface > > Subroutine VecRestoreArrayReadF90(v,array,ierr) > > use petscvecdef, only: tVec > > implicit none > > real(kind=selected_real_kind(10)), pointer :: array(:) > > integer(kind=selected_int_kind(5)) ierr > > type(tVec) v > > End Subroutine > > End Interface > > > > or if use petscvecdef is used in the outer scope, we can: > > Interface > > Subroutine VecRestoreArrayReadF90(v,array,ierr) > > import tVec > > implicit none > > real(kind=selected_real_kind(10)), pointer :: array(:) > > integer(kind=selected_int_kind(5)) ierr > > type(tVec) v > > End Subroutine > > End Interface > > (The two methods (use, only vs import) are equivalent in terms of impact to > > the dictionary.) > > > > Is this compiler ~feature~ something that we intend to work around? Thoughts? > > Best regards, > > Jacob Faibussowitsch > (Jacob Fai - booss - oh - vitch) > Cell: (312) 694-3391 > > > Begin forwarded message: > > > > From: "Roy Musselman" <[email protected]> > > Subject: Re: Case TS005062693 - XLF: ICE in xlfentry compiling a module > > with 358 subroutines > > Date: March 3, 2021 at 08:23:17 CST > > To: Jacob Faibussowitsch <[email protected]> > > Cc: "Gyllenhaal, John C." <[email protected]> > > > > Hi Jacob, > > I tried the first suggestion and commented out the use statements called > > within the functions. However, I hit the following error complaining about > > specific symbol dependencies provided by the library. 
> > > > .../src/vec/f90-mod/petscvecmod.F90", line 107.37: 1514-084 (S) Identifier > > a is being declared with type name tvec which has not been defined in a > > derived type definition. > > > > IBM provided the following additional explanation and example. Can the > > process used to generate these routines and functions determine the > > specific symbols required and then use the only keyword or import statement > > to include them? > > > > When factoring out use statements out of module procedures, you can just > > delete them. But you can't completely remove them from interface blocks. > > Instead, you can limit them either by using use <module>, only: <symbol> or > > import <symbol> . if the hundreds of use statements in the program are > > factored out / limited in this way, that should reduce the dictionary size > > sufficiently for the program to compile. > > > > For example > > Interface > > Subroutine VecRestoreArrayReadF90(v,array,ierr) > > use petscvecdef > > real(kind=selected_real_kind(10)), pointer :: array(:) > > integer(kind=selected_int_kind(5)) ierr > > type(tVec) v > > End Subroutine > > End Interface > > > > imports all symbols from petscvecdef into the dictionary even though we > > only need tVec . So we can either: > > > > Interface > > Subroutine VecRestoreArrayReadF90(v,array,ierr) > > use petscvecdef, only: tVec > > implicit none > > real(kind=selected_real_kind(10)), pointer :: array(:) > > integer(kind=selected_int_kind(5)) ierr > > type(tVec) v > > End Subroutine > > End Interface > > > > or if use petscvecdef is used in the outer scope, we can: > > Interface > > Subroutine VecRestoreArrayReadF90(v,array,ierr) > > import tVec > > implicit none > > real(kind=selected_real_kind(10)), pointer :: array(:) > > integer(kind=selected_int_kind(5)) ierr > > type(tVec) v > > End Subroutine > > End Interface > > (The two methods (use, only vs import) are equivalent in terms of impact to > > the dictionary.) 
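The question raised above (whether the generator could work out the specific symbols each interface needs) can be illustrated with a small Python sketch. It scans one interface body for `type(...)` references and turns a blanket `use <module>` into `use <module>, only: ...`. The helper name `limit_use` and the regex-based parsing are assumptions made purely for illustration: this is not how the PETSc generator works, it handles only derived types (not constants or kind parameters), and it assumes every referenced type comes from the one module named on the `use` line.

```python
import re

# Matches derived-type references such as "type(tVec)" (case-insensitive).
TYPE_RE = re.compile(r"\btype\s*\(\s*(\w+)\s*\)", re.IGNORECASE)
# Matches a blanket "use <module>" line that has no "only:" clause.
USE_RE = re.compile(r"^(\s*)use\s+(\w+)\s*$", re.IGNORECASE)


def limit_use(body: str) -> str:
    """Rewrite blanket `use <module>` lines in one interface body so that
    they import only the derived types the body references.

    Hypothetical helper for illustration; not part of PETSc.
    """
    # Collect type names in first-seen order, dropping duplicates.
    names = list(dict.fromkeys(TYPE_RE.findall(body)))
    out = []
    for line in body.splitlines():
        m = USE_RE.match(line)
        if m and names:
            indent, module = m.groups()
            line = f"{indent}use {module}, only: {', '.join(names)}"
        out.append(line)
    return "\n".join(out)


# The interface body from the IBM example above.
example = """\
Subroutine VecRestoreArrayReadF90(v,array,ierr)
  use petscvecdef
  real(kind=selected_real_kind(10)), pointer :: array(:)
  integer(kind=selected_int_kind(5)) ierr
  type(tVec) v
End Subroutine"""

print(limit_use(example))
```

Lines that already carry an `only:` clause are left untouched, so the sketch can be re-run on its own output without changing it.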
> >
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > Roy Musselman
> > IBM HPC Application Analyst at Lawrence Livermore National Lab
> > email: [email protected]
> > LLNL office: 925-422-6033
> > Cell: 507-358-8895, Home: 507-281-9565
> >
> > From: Roy Musselman/Rochester/Contr/IBM
> > To: Jacob Faibussowitsch <[email protected]>
> > Cc: "Gyllenhaal, John C." <[email protected]>
> > Date: 02/24/2021 07:08 PM
> > Subject: Re: [EXTERNAL] Case TS005062693 - XLF: ICE in xlfentry compiling a module with 358 subroutines
> >
> > Hi Jacob,
> > I opened the ticket with IBM: case TS005062693 and the local LLNL
> > Sierra Jira Ticket at
> > https://lc.llnl.gov/jira/projects/SIERRA/issues/SIERRA-111?filter=allissues
> >
> > Today IBM provided the response below. I don't know when I'll have time to
> > try it on the reproducer I gave IBM. Perhaps early next week. Can you
> > review this and see if it helps?
> >
> > The reason for the Internal Compiler Error is because we can't grow an
> > internal dictionary anymore (ie we hit a 2**21 limit).
> > The file contains many module procedures and interfaces that use the same
> > helper module. As a result, we are importing the dictionary entries for
> > that module repeatedly, reaching the limit.
> >
> > Can you please give the following source code workaround a try?
> > Since there is already "use petscvecdefdummy" at the module scope, one
> > workaround might be to remove the unnecessary "use petscvecdefdummy" in
> > vecnotequal and vecequals and all similar procedures.
> > > > For example, the test case has: > > module petscvecdef > > use petscvecdefdummy > > ... > > function vecnotequal(A,B) > > use petscvecdefdummy > > logical vecnotequal > > type(tVec), intent(in) :: A,B > > vecnotequal = (A%v .ne. B%v) > > end function > > function vecequals(A,B) > > use petscvecdefdummy > > logical vecequals > > type(tVec), intent(in) :: A,B > > vecequals = (A%v .eq. B%v) > > end function > > ... > > end module > > Another workaround would be to put the procedure definitions from this > > large module into several submodules. Each submodule would be able to > > accommodate a dictionary with 2**21 entries. > > > > > > Please let us know if one of the above workarounds resolve the issue. > > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Roy Musselman > > IBM HPC Application Analyst at Lawrence Livermore National Lab > > email: [email protected] <mailto:[email protected]> > > LLNL office: 925-422-6033 > > Cell: 507-358-8895, Home: 507-281-9565 > > > > > > Roy Musselman---02/21/2021 09:42:55 PM---Hi Jacob, After some more > > experimentation, I think I may have found what is triggering the ICE. It > > > > From: Roy Musselman/Rochester/Contr/IBM > > To: Jacob Faibussowitsch <[email protected] > > <mailto:[email protected]>> > > Cc: "Gyllenhaal, John C." <[email protected] > > <mailto:[email protected]>> > > Date: 02/21/2021 09:42 PM > > Subject: Re: [EXTERNAL] Re: xlf90_r Internal Compiler Error > > > > > > Hi Jacob, > > > > After some more experimentation, I think I may have found what is > > triggering the ICE. It doesn't appear to be related to the subroutine name > > length. I think the compiler may be hitting an internal limit of the number > > of subroutines within a module. There are 358 subroutines contained in the > > expanded petscmatmod.F90. Removing 4 subroutines will allow the compile to > > complete successfully, so the limit must be 354 subroutines. Is it possible > > for you to bust up petscmatmod into multiple modules? 
I'll package up the > > reproducer and pass it on to the compiler development team. > > > > I've asked for user feedback a couple years ago, when the IBM Power9 > > CORAL-1 Sierra systems were deployed, but received minimal responses. DOE > > is now working with Cray (aka HPE) developing the environment for the > > CORAL-2 system (El Capitan). I'll pass your request to the LLNL person I > > know that is dealing with math libraries for CORAL-2. > > > > We use the spack tool to download and build petsc and its specified > > dependencies. I switched between the PETSC versions by changing the > > PETSCDIR variable in the script I shared with you. I've attached a tar ball > > containing the scripts used to build PETSc via spack. > > > > [attachment "bld-petsc-spack.tgz" deleted by Roy > > Musselman/Rochester/Contr/IBM] > > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Roy Musselman > > IBM HPC Application Analyst at Lawrence Livermore National Lab > > email: [email protected] <mailto:[email protected]> > > LLNL office: 925-422-6033 > > Cell: 507-358-8895, Home: 507-281-9565 > > > > > > Jacob Faibussowitsch ---02/21/2021 12:24:11 PM---Hi Roy, > I'm not sure > > which projects at LLNL are using PETSc or if they chose to build their own > > ve > > > > From: Jacob Faibussowitsch <[email protected] > > <mailto:[email protected]>> > > To: Roy Musselman <[email protected] <mailto:[email protected]>> > > Cc: "Gyllenhaal, John C." <[email protected] > > <mailto:[email protected]>> > > Date: 02/21/2021 12:24 PM > > Subject: [EXTERNAL] Re: xlf90_r Internal Compiler Error > > > > > > > > Hi Roy, I'm not sure which projects at LLNL are using PETSc or if they > > chose to build their own version. Entirely unrelated to our problem, but is > > it possible to find this out? It would be great if yes, but also completely > > fine if not. PETSc > > Hi Roy, > > I'm not sure which projects at LLNL are using PETSc or if they chose to > > build their own version. 
> > Entirely unrelated to our problem, but is it possible to find this out? It
> > would be great if yes, but also completely fine if not. PETSc is
> > potentially undergoing a rather transformative rewrite over the next few
> > years and we’d like to gather current usage data to get a better idea of
> > where PETSc fits into our users’ workflows. But we aren’t sure how to gather
> > this data (we don’t particularly want to scrape and silently send it off
> > without users’ consent/knowledge) absent user questionnaires and HPC usage
> > statistics.
> > If you are interested, I can share with you the spack recipes I use to
> > build petsc with hdf5, hypre, and superlu-dist.
> > Yes that would be quite useful. I can let it percolate through our dev
> > channels for any other recommendations etc.
> > 3.14.0 and 3.14.1
> >
> > "../roymuss/spack-stage-petsc-3.14.0-on3lboy4slkz65tsjttgfmwghzky54jj/spack-src/src/vec/f90-mod/petscvecmod.F90",
> > line 9.13: 1514-219 (S) Unable to access module symbol file for module
> > petscisdefdummy. Check path and file permissions of file. Use association
> > not done for this module.
> > 1501-511 Compilation failed for file petscvecmod.F90.
> > How exactly did you switch between versions? PETSc has 2 types of fortran
> > bindings, “ftn-custom” and “ftn-auto” (technically 3 including the F90
> > files, but those simply call either of the two preceding ones), a copy of
> > which you will find in every src directory. As the names imply, ftn-auto is
> > auto generated while ftn-custom is hand-written.
> >
> > This also means that the ftn-auto files are __not__ tracked by git, so a
> > simple git checkout [new-tag] may not properly dispose of the old
> > auto-generated files (very rare, but IIRC we made a major enough change to
> > the fortran bindings within the last year to warrant having to "make
> > deletefortranstubs" before rebuilding).
> > Adding the option -qlanglvl=2003std or -qlanglvl=2008std produces a bunch
> > of other warning messages, but it still encounters the ICE. So, I'm
> > uncertain if the subroutine name length is the root of the problem.
> > Our current compiler flag selection philosophy is to require a minimum but
> > choose the maximum available reasonable flag for the compiler (i.e. we
> > require C99, but very often you will find that your code is compiled with
> > C11 or C17 if they are available). It is therefore odd that configure did
> > not use the same methodology for fortran compilers. I will relay this on
> > our side.
> > Is it possible for you to use subroutines that are less than 32 characters
> > and see if that works for you? Have you used other fortran 90 compilers
> > and do any of them complain of this?
> > Of all of the small quirks fortran has this is probably the most esoteric
> > one I’ve come across… I’ve attached a list of all the F90 compilers, and
> > their flags which we use in CI/CD (all of which is run multiple times daily
> > and __must__ pass). I got them all via grep, so there may be some
> > duplicates here or there. As for using shorter names, this is also
> > something we can look at, but since none of the other compilers have had
> > issues with this I’m not sure this is the change to make.
> > Are there any unusual or questionable language constructs used in any of
> > the functions mentioned above that may possibly challenge the compiler?
> > Not that I am aware of, but again I will ask around our dev channels and
> > see if anything comes to mind.
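The name-length question discussed above is easy to check mechanically, even though the replies at the top of this thread attribute the ICE to the dictionary limit rather than to name length. The following Python sketch lists procedure names longer than the Fortran 90/95 limit of 31 characters. The function name `long_names` and the simple regex are assumptions for illustration; the pattern only recognises statements that begin with `subroutine` or `function`, so prefixed forms such as `logical function f` are not picked up.

```python
import re

# Fortran 90/95 limits names to 31 characters; Fortran 2003 raised this to 63.
F90_MAX_NAME_LEN = 31

# Matches "subroutine <name>" or "function <name>" at the start of a line.
# "end subroutine ..." lines do not match because they start with "end".
PROC_RE = re.compile(
    r"^\s*(?:subroutine|function)\s+(\w+)", re.IGNORECASE | re.MULTILINE
)


def long_names(source: str, limit: int = F90_MAX_NAME_LEN) -> list:
    """Return the procedure names in `source` longer than `limit` characters."""
    return [name for name in PROC_RE.findall(source) if len(name) > limit]


# A few of the names quoted in this thread.
sample = """
subroutine MatSetLayouts(a,b,c,z)
subroutine MatSeqAIJCUSPARSESetGenerateTranspose(a,b,z)
subroutine MatCreateMPIMatConcatenateSeqMat
end subroutine
"""

for name in long_names(sample):
    print(len(name), name)
```

Passing `limit=63` checks against the Fortran 2003 limit instead.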
> >
> > Best regards,
> >
> > Jacob Faibussowitsch
> > (Jacob Fai - booss - oh - vitch)
> > Cell: (312) 694-3391
> > [attachment "compilerList" deleted by Roy Musselman/Rochester/Contr/IBM]
> > On Feb 20, 2021, at 22:05, Roy Musselman <[email protected]> wrote:
> > Hi Jacob,
> > Thanks for letting me know that you are a PETSc developer and that you are
> > testing it on the LLNL lassen system. I've used the spack build tool to
> > build and deploy a few versions on the systems. I'm not sure which projects
> > at LLNL are using PETSc or if they chose to build their own version. I did
> > however provide a single precision version upon request that was integrated
> > with MVAPICH2-MPI instead of the IBM-provided Spectrum-MPI. Here's what's
> > available on the systems today.
> >
> > > ml avail petsc
> > ----------------------------- /usr/tcetmp/modulefiles/Core -----------------------------
> > petsc/default petsc/3.10.2 petsc/3.11.3 petsc/3.13.0 (D)
> > petsc/3.13.1-mvapich2-2020.01.09-xl-2020.03.18.single
> >
> > If you are interested, I can share with you the spack recipes I use to
> > build petsc with hdf5, hypre, and superlu-dist.
> >
> > After several attempts I was able to reproduce the Internal Compiler Error
> > (ICE) that you are seeing using version 3.14.4. I've whittled it down to
> > the petscmatmod.F90 file and its specific dependencies.
> > The following script is what I'm using. Note that in the 2nd set of
> > compiles, the -E option is used to expand all included source files and
> > headers and encapsulate them in a single large source file. This can be
> > used to help isolate the source of the problem.
> >
> > #!/bin/bash
> >
> > PETSCDIR="../roymuss/spack-stage-petsc-3.14.4-eh5arny7l3cqjlltlfpjp6f4jofbnmz6/spack-src"
> >
> > OPTIONS=" -qmoddir=moddir -I$PETSCDIR/arch-linux-c-opt/include -I$PETSCDIR/include"
> > mkdir -p moddir
> >
> > set -x
> >
> > # Compile original source files including dependencies
> > if [ 0 = 1 ]; then
> >   mpif90 -c -g $OPTIONS $PETSCDIR/src/sys/f90-mod/petscsysmod.F90 -o petscsysmod.o
> >   mpif90 -c -g $OPTIONS $PETSCDIR/src/vec/f90-mod/petscvecmod.F90 -o petscvecmod.o
> >   mpif90 -c -g $OPTIONS $PETSCDIR/src/mat/f90-mod/petscmatmod.F90 -o petscmatmod.o
> > fi
> >
> > # Use -E option to expand source into full source files
> > if [ 0 = 1 ]; then
> >   mpif90 -c -g -E $OPTIONS $PETSCDIR/src/sys/f90-mod/petscsysmod.F90 -o full_petscsysmod.F90
> >   mpif90 -c -g -E $OPTIONS $PETSCDIR/src/vec/f90-mod/petscvecmod.F90 -o full_petscvecmod.F90
> >   mpif90 -c -g -E $OPTIONS $PETSCDIR/src/mat/f90-mod/petscmatmod.F90 -o full_petscmatmod.F90
> > fi
> >
> > # Compile from full source files
> > if [ 1 = 1 ]; then
> >   mpif90 -c -g -Imoddir -qmoddir=moddir full_petscsysmod.F90 -o full_petscsysmod.o
> >   mpif90 -c -g -Imoddir -qmoddir=moddir full_petscvecmod.F90 -o full_petscvecmod.o
> >   mpif90 -V -c -g -Imoddir -qmoddir=moddir full_petscmatmod.F90 -o full_petscmatmod.o
> > fi
> >
> > <eof>
> >
> > PETSc 3.13.6 is the most recent version that did not fail. I tried all
> > subsequent versions and got the following results:
> >
> > 3.14.0 and 3.14.1
> >
> > "../roymuss/spack-stage-petsc-3.14.0-on3lboy4slkz65tsjttgfmwghzky54jj/spack-src/src/vec/f90-mod/petscvecmod.F90",
> > line 9.13: 1514-219 (S) Unable to access module symbol file for module
> > petscisdefdummy. Check path and file permissions of file. Use association
> > not done for this module.
> > 1501-511 Compilation failed for file petscvecmod.F90.
> >
> > 3.14.2, 3.14.3, and 3.14.4
> >
> > . . .
> > ** matnullspaceequals === End of Compilation 8 ===
> > *** Error in `/usr/tce/packages/xl/xl-2020.11.12/xlf/16.1.1/exe/xlfentry': free(): invalid pointer: 0x0000200001740018 ***
> >
> > Examining the tail end of petscmatmod.F90
> >
> >  80 function matnullspaceequals(A,B)
> >  81 use petscmatdefdummy
> >  82 logical matnullspaceequals
> >  83 type(tMatNullSpace), intent(in) :: A,B
> >  84 matnullspaceequals = (A%v .eq. B%v)
> >  85 end function
> >  86
> >  87 #if defined(_WIN32) && defined(PETSC_USE_SHARED_LIBRARIES)
> >  88 !DEC$ ATTRIBUTES DLLEXPORT::matnotequal
> >  89 !DEC$ ATTRIBUTES DLLEXPORT::matequals
> >  90 !DEC$ ATTRIBUTES DLLEXPORT::matfdcoloringnotequal
> >  91 !DEC$ ATTRIBUTES DLLEXPORT::matfdcoloringequals
> >  92 !DEC$ ATTRIBUTES DLLEXPORT::matnullspacenotequal
> >  93 !DEC$ ATTRIBUTES DLLEXPORT::matnullspaceequals
> >  94 #endif
> >  95 module petscmat
> >  96 use petscmatdef
> >  97 use petscvec
> >  98 #include <../src/mat/f90-mod/petscmat.h90>
> >  99 interface
> > 100 #include <../src/mat/f90-mod/ftn-auto-interfaces/petscmat.h90>
> > 101 end interface
> > 102 end module
> > 103
> >
> > Compiling the matnullspaceequals function was successful just before
> > hitting the error. The error goes away when removing either or both of the
> > #include lines 98 and 100. Both #include statements are required to produce
> > the error. The 3.13.6 and 3.14.4 versions of the file identified in the
> > first #include at line 98 are identical. The file identified in line 100 is
> > different between 3.13.6 and 3.14.4.
> > Just looking at the list of subroutines contained within each version, the
> > following are the differences.
> >
> > Old subroutines available in 3.13.6 but removed from 3.14.4
> >   subroutine MatFreeIntermediateDataStructures(a,z)
> >
> > New subroutines available in 3.14.4 but not contained in 3.13.6
> >   subroutine MatDenseReplaceArray(a,b,z)
> >   subroutine MatIsShell(a,b,z)
> >   subroutine MatRARtMultEqual(a,b,c,d,e,z)
> >   subroutine MatScaLAPACKGetBlockSizes(a,b,c,z)
> >   subroutine MatScaLAPACKSetBlockSizes(a,b,c,z)
> >   subroutine MatSeqAIJCUSPARSESetGenerateTranspose(a,b,z)
> >   subroutine MatSeqAIJSetTotalPreallocation(a,b,z)
> >   subroutine MatSetLayouts(a,b,c,z)
> >
> > Methodically removing the new subroutines did not provide a consistent
> > result. But I did notice the extra long subroutine name
> > MatSeqAIJCUSPARSESetGenerateTranspose had 37 characters.
> > A little research found: In Fortran 90/95 the maximum length was 31
> > characters, in Fortran 2003 it is now 63 characters. I found the following
> > subroutines with greater than 31 characters
> >
> >   subroutine MatCreateMPIMatConcatenateSeqMat
> >   subroutine MatFactorFactorizeSchurComplement
> >   subroutine MatMPIAdjCreateNonemptySubcommMat
> >   subroutine MatSeqAIJCUSPARSESetGenerateTranspose
> >   subroutine MatMPIAIJSetUseScalableIncreaseOverlap
> >   subroutine MatFactorSolveSchurComplementTranspose
> >
> > I individually ifdef'd them out of the source file and was able to compile
> > the files successfully without encountering the ICE.
> >
> > I'm not exactly sure what maximum subroutine name length the XLF
> > compiler allows, but if it is only 31, it would be useful if the compiler
> > detected this and issued a message instead of the ICE.
> > Adding the option -qlanglvl=2003std or -qlanglvl=2008std produces a bunch
> > of other warning messages, but it still encounters the ICE. So, I'm
> > uncertain if the subroutine name length is the root of the problem.
> >
> > Is it possible for you to use subroutines that are less than 32 characters
> > and see if that works for you?
Have you used other fortran 90 compilers > > and do any of them complain of this? > > Are there any unusual or questionable language constructs used in any of > > the functions mentioned above that may possibly challenge the compiler? > > > > I'll package this up and send it to the IBM XL compiler development team > > for their examination and comment. > > > > Best Regards, > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Roy Musselman > > IBM HPC Application Analyst at Lawrence Livermore National Lab > > email: [email protected] <mailto:[email protected]> > > LLNL office: 925-422-6033 > > Cell: 507-358-8895, Home: 507-281-9565 > > > > <graycol.gif>Jacob Faibussowitsch ---02/18/2021 02:17:05 PM---> The most > > recently built version available on the CORAL systems is 3.13.0. (ml load > > petsc/3.13.0) W > > > > From: Jacob Faibussowitsch <[email protected] > > <mailto:[email protected]>> > > To: Roy Musselman <[email protected] <mailto:[email protected]>> > > Cc: "Gyllenhaal, John C." <[email protected] > > <mailto:[email protected]>> > > Date: 02/18/2021 02:17 PM > > Subject: [EXTERNAL] Re: xlf90_r Internal Compiler Error > > > > > > > > > > > > The most recently built version available on the CORAL systems... > > This Message Is From an External Sender > > This message came from outside your organization. > > The most recently built version available on the CORAL systems is 3.13.0. > > (ml load petsc/3.13.0) Will that work for you? > > I am building petsc from source as part of development work on petsc itself > > so modules are unfortunately not useful here. > > The files you sent me do not contain all the dependencies (other mod files) > > required to reproduce the error. > > I'll attempt to build version 3.14.4 from scratch and recreate the failing > > symptom you are observing. > > Yes, petsc uses an automated system to generate the fortran files from C > > which goes about 20 rabbit holes deeper than I was willing to dig. 
Let me know if you run into trouble configuring and building petsc; I can point
> > you in the right direction. I’ve attached a “reconfigure” script with this
> > email; it contains all of the arguments I used to configure petsc
> > successfully on Lassen. If you place it into your $PETSC_DIR (i.e. the
> > folder titled “petsc” that contains a “configure” file) and run:
> >
> > $ python3 ./reconfigure-arch-linux-c-debug.py
> >
> > It should work. If not, you will have to
> >
> > $ ./configure --all-the-args --in-the-reconfigure --file
> >
> > Best regards,
> >
> > Jacob Faibussowitsch
> > (Jacob Fai - booss - oh - vitch)
> > Cell: (312) 694-3391
> > [attachment "reconfigure-arch-linux-c-debug.py" deleted by Roy Musselman/Rochester/Contr/IBM]
> > On Feb 18, 2021, at 15:07, Roy Musselman <[email protected]> wrote:
> > Hi Jacob,
> >
> > The source file appears to come from the PETSc 3.14.4 library. The most
> > recently built version available on the CORAL systems is 3.13.0. (ml load
> > petsc/3.13.0) Will that work for you?
> > The files you sent me do not contain all the dependencies (other mod files)
> > required to reproduce the error.
> > I'll attempt to build version 3.14.4 from scratch and recreate the failing
> > symptom you are observing.
> >
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > Roy Musselman
> > IBM HPC Application Analyst at Lawrence Livermore National Lab
> > email: [email protected]
> > LLNL office: 925-422-6033
> > Cell: 507-358-8895, Home: 507-281-9565
> >
> > From: Roy Musselman/Rochester/Contr/IBM
> > To: LC Hotline <[email protected]>
> > Cc: "Gyllenhaal, John C."
<[email protected] > > <mailto:[email protected]>> > > Date: 02/18/2021 11:18 AM > > Subject: Re: [EXTERNAL] FW: xlf90_r Internal Compiler Error > > > > > > > > > > > > I'll take a look. > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Roy Musselman > > IBM HPC Application Analyst at Lawrence Livermore National Lab > > email: [email protected] <mailto:[email protected]> > > LLNL office: 925-422-6033 > > Cell: 507-358-8895, Home: 507-281-9565 > > > > > > <graycol.gif>LC Hotline ---02/18/2021 11:03:55 AM---Hi John, Roy, Can you > > help this user with the problem that he is seeing when he tries to build > > with > > > > From: LC Hotline <[email protected] <mailto:[email protected]>> > > To: "Gyllenhaal, John C." <[email protected] > > <mailto:[email protected]>>, Roy Musselman <[email protected] > > <mailto:[email protected]>> > > Date: 02/18/2021 11:03 AM > > Subject: [EXTERNAL] FW: xlf90_r Internal Compiler Error > > > > > > > > Hi John, Roy, Can you help this user with the problem that he is... > > This Message Is From an External Sender > > This message came from outside your organization. > > Hi John, Roy, > > > > Can you help this user with the problem that he is seeing when he tries to > > build with xlf90 on Lassen? 
> >
> > Thanks,
> > Ryan
> > --
> > LC Hotline
> >
> > From: Jacob Faibussowitsch <[email protected]>
> > Date: Wednesday, February 17, 2021 at 5:27 PM
> > To: LC Hotline <[email protected]>
> > Subject: xlf90_r Internal Compiler Error
> >
> > Hello LC Support,
> >
> > While compiling my application on Lassen I seem to have run afoul of the xlf90
> > mpi compiler wrapper with the following error:
> >
> > *** Error in `/usr/tce/packages/xl/xl-2020.11.12/xlf/16.1.1/exe/xlfentry': free(): invalid pointer: 0x0000200001740018 ***
> >
> > I’m fairly certain this isn’t my fault as this is code that compiles
> > regularly on extensive CI/CD under various other compilers and machines,
> > but you can never rule it out. I have included a verbose full log of my
> > make run (which includes a comprehensive rundown of the environment) as
> > well as a separate file containing the error message and stack trace from
> > the compiler. Additionally I have also included the file which I believe is
> > causing the error. Let me know if there is anything else I should send.
> >
> > P.S. My list of loaded modules:
> >
> > Currently Loaded Modules:
> >   1) StdEnv (S)                     4) cuda/11.1.1    7) valgrind/3.16.1
> >   2) clang/ibm-11.0.0               5) python/3.8.2   8) lapack/3.9.0-xl-2020.11.12
> >   3) spectrum-mpi/rolling-release   6) cmake/3.18.0   9) hip/3.0.0
> >
> > Best regards,
> >
> > Jacob Faibussowitsch
> > (Jacob Fai - booss - oh - vitch)
> > Cell: (312) 694-3391
> > [attachment "errorReport.zip" deleted by Roy Musselman/Rochester/Contr/IBM]
> >
