Daniel and Doug,

We did some experimentation with HPCToolkit yesterday. To fix your problems with the analysis of Fortran binaries, you need to install a new ‘develop’ version of HPCToolkit with Dyninst master. The following complete recipe should work for CPU-only codes:

    git clone https://github.com/spack/spack
    source spack/share/spack/setup-env.sh
    spack compiler find
    spack install hpctoolkit@develop ^dyninst@master
    spack load hpctoolkit

Working with GPU codes requires a bit of fiddling with a spack packages.yaml file to indicate where GPU components can be found, as documented here: https://hpctoolkit.org/software-instructions.html

FYI: The need to change from hpctoolkit@2024.01.1 to hpctoolkit@develop is only because the API to Dyninst has changed since the 2024.01.1 snapshot. The older hpctoolkit won’t build with the newer dyninst.

Best,

John
--
John Mellor-Crummey        Professor
Dept of Computer Science   Rice University
email: joh...@rice.edu     phone: 713-348-5179

> On May 12, 2025, at 1:37 PM, John Mellor-Crummey <joh...@rice.edu> wrote:
>
> Using Daniel’s gfs_model.x binary, I confirmed
> • (bad) that hpcstruct in hpctoolkit version 2024.01.1 based on Dyninst 13.0.0 fails on this binary
> • (good) the Dyninst problem for analyzing DWARF subrange information from Fortran applications has been fixed in Dyninst master.
>
> Unfortunately, Dyninst master is not usable with the HPCToolkit 2024.01.1 release. However, the updated version of Dyninst is usable with HPCToolkit’s develop branch. Unfortunately, the spack recipe for deploying our develop branch seems to be missing a few library paths that don’t get baked in by spack. I will report back to this list when we have fixed HPCToolkit's spack recipe so you can use our develop branch.
>
> Best,
>
> John
>
>> On May 12, 2025, at 10:26 AM, Daniel Kokron - NOAA Affiliate <daniel.kok...@noaa.gov> wrote:
>>
>> Ahhhh, that explains the following and how to get around it. Thank you.
>>
>> WARNING: Skipping DWARF for gfs_model.x, over threshold (377978416 > 104857600)
>>
>> On Mon, May 12, 2025 at 10:13 AM John Mellor-Crummey <joh...@rice.edu> wrote:
>> Daniel,
>>
>> One more thing:
>>
>> While we work on resolving the issue with hpcstruct, you should be able to run hpcprof on your measurement data even if hpcstruct failed to analyze this binary. hpcprof includes the ability to read DWARF (using a different library that shouldn’t crash).
>>
>> When you run hpcprof, you should use
>>
>>     hpcprof --dwarf-max-size=unlimited <measurement directory>
>>
>> Best,
>>
>> John
>>
>>> On May 12, 2025, at 10:05 AM, Daniel Kokron - NOAA Affiliate <daniel.kok...@noaa.gov> wrote:
>>>
>>> Got permission to share the executable. Link sent.
>>>
>>> On Fri, May 9, 2025 at 2:26 PM Daniel Kokron - NOAA Affiliate <daniel.kok...@noaa.gov> wrote:
>>> I'll ask about providing the executable.
>>>
>>> On Fri, May 9, 2025 at 1:52 PM John Mellor-Crummey <joh...@rice.edu> wrote:
>>> Hi Daniel,
>>>
>>> Thanks for the callstack.
>>>
>>> The problem seems to be exactly the same one recently encountered by Doug Pase for a Fortran program at Sandia. This is a problem inside the type processing by the Dyninst software written by our collaborators.
>>>
>>> Can you share a binary with us to facilitate debugging? The Sandia binary is export controlled and only accessible inside their firewall. Having a non-export controlled binary for debugging would make our lives easier.
>>>
>>> Best,
>>>
>>> John
>>>
>>>> On May 9, 2025, at 12:15 PM, Daniel Kokron - NOAA Affiliate <daniel.kok...@noaa.gov> wrote:
>>>>
>>>> The application is compiled with Intel ifort. HPCToolkit and its dependencies are compiled with gcc-13.2.1. I attached the spec for HPCToolkit.
>>>>
>>>> (gdb) run --nocache /lfs/h1/hpc/support/daniel.kokron/Tickets/2025042910000034/sorc/ufs_model.fd/build_fv3_1/gfs_model.x
>>>> Starting program: /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/hpctoolkit-2024.01.1-a3im66mlumyu3hbzmeuor3kj3l553yau/bin/hpcstruct --nocache /lfs/h1/hpc/support/daniel.kokron/Tickets/2025042910000034/sorc/ufs_model.fd/build_fv3_1/gfs_model.x
>>>> Missing separate debuginfos, use: zypper install glibc-debuginfo-2.31-150300.63.1.x86_64
>>>> Missing separate debuginfo for /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libstdc++.so.6
>>>> Try: zypper install -C "debuginfo(build-id)=c74eca671e2dd0f063706372d103f8acef88f1e3"
>>>> Missing separate debuginfo for /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libgomp.so.1
>>>> Try: zypper install -C "debuginfo(build-id)=54684492738e640bcd600e830cee025dd8771a20"
>>>> Missing separate debuginfo for /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libgcc_s.so.1
>>>> Try: zypper install -C "debuginfo(build-id)=12f775ec4aeb94b749897b1b65638f18b61d1b1f"
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>>> begin sequential analysis of CPU binary /lfs/h1/hpc/support/daniel.kokron/Tickets/2025042910000034/sorc/ufs_model.fd/build_fv3_1/gfs_model.x (size = 377978672, threads = 1)
>>>> hpcstruct: /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/boost-1.87.0-2cldxfpwec5rbbhxutja5lwcgzh6fbhc/include/boost/smart_ptr/shared_ptr.hpp:550: typename boost::detail::sp_member_access<T>::type boost::shared_ptr<T>::operator->() const [with T = Dyninst::SymtabAPI::typeSubrange; typename boost::detail::sp_member_access<T>::type = Dyninst::SymtabAPI::typeSubrange*]: Assertion `px != 0' failed.
>>>>
>>>> Program received signal SIGABRT, Aborted.
>>>> 0x0000155553e2fd2b in raise () from /lib64/libc.so.6
>>>> (gdb) where
>>>> #0  0x0000155553e2fd2b in raise () from /lib64/libc.so.6
>>>> #1  0x0000155553e313e5 in abort () from /lib64/libc.so.6
>>>> #2  0x0000155553e27c6a in __assert_fail_base () from /lib64/libc.so.6
>>>> #3  0x0000155553e27cf2 in __assert_fail () from /lib64/libc.so.6
>>>> #4  0x0000155554d65127 in boost::enable_if<boost::integral_constant<bool, !((bool)boost::is_same<Dyninst::SymtabAPI::Type, Dyninst::SymtabAPI::typeSubrange>::value)>, boost::shared_ptr<Dyninst::SymtabAPI::Type> >::type Dyninst::SymtabAPI::typeCollection::addOrUpdateType<Dyninst::SymtabAPI::typeSubrange>(boost::shared_ptr<Dyninst::SymtabAPI::typeSubrange>) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #5  0x0000155554d547e6 in Dyninst::SymtabAPI::DwarfWalker::parseSubrange() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #6  0x0000155554d5a0a8 in Dyninst::SymtabAPI::DwarfWalker::parse_int(Dwarf_Die, bool, bool) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #7  0x0000155554d5b235 in Dyninst::SymtabAPI::DwarfWalker::findAnyType(Dwarf_Attribute, bool, boost::shared_ptr<Dyninst::SymtabAPI::Type>&) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #8  0x0000155554d5b732 in Dyninst::SymtabAPI::DwarfWalker::findType(boost::shared_ptr<Dyninst::SymtabAPI::Type>&, bool) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #9  0x0000155554d5497b in Dyninst::SymtabAPI::DwarfWalker::parseArray() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #10 0x0000155554d59fb8 in Dyninst::SymtabAPI::DwarfWalker::parse_int(Dwarf_Die, bool, bool) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #11 0x0000155554d5a53b in Dyninst::SymtabAPI::DwarfWalker::parse_int(Dwarf_Die, bool, bool) () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #12 0x0000155554d5bab0 in Dyninst::SymtabAPI::DwarfWalker::parseModule(Dwarf_Die, Dyninst::SymtabAPI::Module*&) [clone .constprop.0] [clone .isra.0] () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #13 0x0000155554d5c15c in Dyninst::SymtabAPI::DwarfWalker::parse() [clone ._omp_fn.0] () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #14 0x000015555403b306 in GOMP_parallel () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libgomp.so.1
>>>> #15 0x0000155554d5d2ed in Dyninst::SymtabAPI::DwarfWalker::parse() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #16 0x0000155554d0a0c1 in Dyninst::SymtabAPI::Object::parseTypeInfo() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #17 0x0000155554cd48a7 in Dyninst::SymtabAPI::Symtab::parseTypes() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #18 0x0000155553fef5d7 in __pthread_once_slow () from /lib64/libpthread.so.0
>>>> #19 0x0000155554ccc8b4 in Dyninst::SymtabAPI::Symtab::parseTypesNow() () from /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #20 0x00000000004418c8 in Inline::openSymtab (elfFile=elfFile@entry=0x8d94b0) at Struct-Inline.cpp:132
>>>> #21 0x000000000043cb31 in BAnal::Struct::makeStructure (filename=..., outFile=outFile@entry=0x830ab0, gapsFile=gapsFile@entry=0x0, gaps_filenm=..., search_path=..., structOpts=...) at Struct.cpp:770
>>>> #22 0x000000000042cb42 in doSingleBinary (args=..., sb=sb@entry=0x7ffffffd8740) at /usr/include/c++/13/bits/basic_string.tcc:238
>>>> #23 0x0000000000412cfd in realmain (argc=<optimized out>, argv=<optimized out>) at main.cpp:209
>>>> #24 0x000000000041220a in main (argc=<optimized out>, argv=<optimized out>) at main.cpp:137
>>>>
>>>> On Fri, May 9, 2025 at 11:16 AM John Mellor-Crummey <joh...@rice.edu> wrote:
>>>> Hi Daniel,
>>>>
>>>> You should be able to run hpcstruct under gdb and then run it directly on the offending binary as follows
>>>>
>>>>     gdb `which hpcstruct`
>>>>     run --nocache /path/to/gfs_model
>>>>
>>>> Then, you can send us a call path. By any chance is this a Fortran code compiled with gfortran? We are presently looking into a complaint about that from Sandia.
>>>>
>>>> Best,
>>>>
>>>> John
>>>>
>>>>> On May 9, 2025, at 8:59 AM, Daniel Kokron - NOAA Affiliate via HPCToolkit-forum <hpctoolkit-fo...@mailman.rice.edu> wrote:
>>>>>
>>>>> I am encountering the following error while running hpcstruct. I cannot find the core file in any of the usual places. I have also tried running hpcstruct under gdb without getting very far.
>>>>>
>>>>> Wondering what my debugging options are?
>>>>>
>>>>> begin concurrent analysis of CPU binary gfs_model. (size = 377978416, threads = 1)
>>>>> /bin/sh: line 32: 63480 Aborted (core dumped)
>>>>> /spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/hpctoolkit-2024.01.1-a3im66mlumyu3hbzmeuor3kj3l553yau/bin/hpcstruct.
>>>>> --nocache -j 1 -o $struct_name -M $meas_dir /Baseline_6Hr_WithWW3Restarts_Trace.16774.rawdata/cpubins/model.x > $warn_name 2>&1
>>>>>
>>>>> Dan
>>>>> _______________________________________________
>>>>> HPCToolkit-forum mailing list
>>>>> hpctoolkit-fo...@mailman.rice.edu
>>>>> https://mailman.rice.edu/mailman/listinfo/hpctoolkit-forum
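
For anyone scripting the hpcprof workaround quoted in this thread, here is a minimal sketch. It is not part of HPCToolkit. It assumes the 104857600-byte limit shown in the "Skipping DWARF" warning is the default threshold and that it is compared against the size of the binary, as the numbers in the warning suggest; other HPCToolkit versions may behave differently. The function names over_dwarf_threshold and hpcprof_command are hypothetical helpers invented for illustration. The script only prints the command line it would use; it does not run hpcprof.

```shell
#!/bin/sh
# Assumed default DWARF size threshold in bytes, copied from the warning quoted
# in this thread (104857600 = 100 MiB). Treat it as an assumption, not a spec.
DWARF_THRESHOLD=104857600

# Hypothetical helper: exit status 0 if a binary of the given size (in bytes)
# is strictly larger than the assumed threshold, non-zero otherwise.
over_dwarf_threshold() {
    [ "$1" -gt "$DWARF_THRESHOLD" ]
}

# Hypothetical helper: print an hpcprof command line for a measurement
# directory. --dwarf-max-size=unlimited (the option given in this thread) is
# added only when the binary size exceeds the assumed threshold.
hpcprof_command() {
    size_bytes=$1
    meas_dir=$2
    if over_dwarf_threshold "$size_bytes"; then
        printf 'hpcprof --dwarf-max-size=unlimited %s\n' "$meas_dir"
    else
        printf 'hpcprof %s\n' "$meas_dir"
    fi
}
```

A possible use is hpcprof_command $(wc -c < gfs_model.x) <measurement directory>, which prints the command to run for that binary.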