Daniel and Doug,

We did some experimentation with HPCToolkit yesterday. To fix your problems 
with the analysis of Fortran binaries, you need to install a new ‘develop’ 
version of HPCToolkit with Dyninst master. The following complete recipe should 
work for CPU-only codes:

git clone https://github.com/spack/spack
source spack/share/spack/setup-env.sh
spack compiler find
spack install hpctoolkit@develop ^dyninst@master
spack load hpctoolkit
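
As a quick sanity check after the recipe above, a sketch like the following could confirm that the tools mentioned in this thread (hpcstruct and hpcprof) are visible on PATH once `spack load hpctoolkit` has run. The helper name `check_tools` is hypothetical and not part of HPCToolkit or Spack.

```shell
# check_tools is a hypothetical helper, not part of HPCToolkit or Spack.
# It prints where each named tool resolves on PATH and returns the number
# of tools that could not be found (0 means all were found).
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s: %s\n' "$tool" "$(command -v "$tool")"
        else
            printf '%s: not found\n' "$tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Tool names taken from this thread; run after `spack load hpctoolkit`.
check_tools hpcstruct hpcprof || echo "some tools are missing; was 'spack load hpctoolkit' run in this shell?"
```

If both tools resolve to paths inside the Spack install tree, the new build is the one in use.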

Working with GPU codes requires a bit of fiddling with a spack packages.yaml 
file to indicate where the GPU components can be found, as documented here: 
https://hpctoolkit.org/software-instructions.html
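
For illustration only, here is a minimal sketch of what such a packages.yaml entry might look like for a system-installed CUDA, written into a scratch directory so that no existing Spack configuration is touched. The directory name, CUDA version, and prefix are placeholders and do not come from this thread; the page linked above remains the authoritative description.

```shell
# Scratch location for the example (hypothetical); nothing under ~/.spack is modified.
cfg_dir="${TMPDIR:-/tmp}/spack-gpu-example"
mkdir -p "$cfg_dir"

# Declare an external CUDA so Spack uses the system copy instead of building one.
# The version and prefix are placeholders; replace them with the local values.
cat > "$cfg_dir/packages.yaml" <<'EOF'
packages:
  cuda:
    buildable: false
    externals:
    - spec: cuda@12.4.0
      prefix: /usr/local/cuda-12.4
EOF

echo "wrote $cfg_dir/packages.yaml"

# Once adapted, the directory could be passed to Spack as an extra configuration
# scope, for example (assuming the +cuda variant is the one wanted):
#   spack -C "$cfg_dir" install hpctoolkit@develop +cuda ^dyninst@master
```

Keeping the example in its own configuration scope makes it easy to discard or to compare against the documented instructions before merging anything into a real Spack setup.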

FYI: The need to change from hpctoolkit@2024.01.1 to hpctoolkit@develop is only 
because the API to Dyninst has changed since the 2024.01.1 snapshot. The older 
hpctoolkit won’t build with the newer dyninst.

Best,

John
--
John Mellor-Crummey         Professor
Dept of Computer Science    Rice University
email: joh...@rice.edu      phone: 713-348-5179



> On May 12, 2025, at 1:37 PM, John Mellor-Crummey <joh...@rice.edu> wrote:
> 
> Using Daniel’s gfs_model.x binary, I confirmed:
>     • (bad) hpcstruct in hpctoolkit version 2024.01.1, based on Dyninst 
>       13.0.0, fails with this binary
>     • (good) the Dyninst problem with analyzing DWARF subrange information 
>       from Fortran applications has been fixed in Dyninst master.
> 
> 
> Unfortunately, Dyninst master is not usable with the HPCToolkit 2024.01.1 
> release. However, the updated version of Dyninst is usable with HPCToolkit’s 
> develop branch. Unfortunately, the spack recipe for deploying our develop 
> branch seems to be missing a few library paths that don’t get baked in by 
> spack. I will report back to this list when we have fixed HPCToolkit's spack 
> recipe so you can use our develop branch.
> 
> Best,
> 
> John
> --
> John Mellor-Crummey         Professor
> Dept of Computer Science    Rice University
> email: joh...@rice.edu      phone: 713-348-5179
> 
> 
> 
>> On May 12, 2025, at 10:26 AM, Daniel Kokron - NOAA Affiliate 
>> <daniel.kok...@noaa.gov> wrote:
>> 
>> Ahhhh, that explains the following and how to get around it.  Thank you.
>> 
>> WARNING: Skipping DWARF for gfs_model.x, over threshold (377978416 > 
>> 104857600)
>> 
>> On Mon, May 12, 2025 at 10:13 AM John Mellor-Crummey <joh...@rice.edu> wrote:
>> Daniel,
>> 
>> One more thing:
>> 
>> While we work on resolving the issue with hpcstruct, you should be able to 
>> run hpcprof on your measurement data even if hpcstruct failed to analyze 
>> this binary. hpcprof includes the ability to read DWARF (using a different 
>> library that shouldn’t crash).
>> 
>> When you run hpcprof, you should use
>> 
>> hpcprof --dwarf-max-size=unlimited <measurement directory>
>> 
>> Best,
>> 
>> John
>> --
>> John Mellor-Crummey         Professor
>> Dept of Computer Science    Rice University
>> email: joh...@rice.edu      phone: 713-348-5179
>> 
>> 
>> 
>>> On May 12, 2025, at 10:05 AM, Daniel Kokron - NOAA Affiliate 
>>> <daniel.kok...@noaa.gov> wrote:
>>> 
>>> Got permission to share the executable.  Link sent.
>>> 
>>> On Fri, May 9, 2025 at 2:26 PM Daniel Kokron - NOAA Affiliate 
>>> <daniel.kok...@noaa.gov> wrote:
>>> I'll ask about providing the executable.
>>> 
>>> On Fri, May 9, 2025 at 1:52 PM John Mellor-Crummey <joh...@rice.edu> wrote:
>>> Hi Daniel,
>>> 
>>> Thanks for the callstack.
>>> 
>>> The problem seems to be exactly the same one recently encountered by Doug 
>>> Pase for a Fortran program at Sandia. This is a problem inside the type 
>>> processing by the Dyninst software written by our collaborators. 
>>> 
>>> Can you share a binary with us to facilitate debugging? The Sandia binary 
>>> is export controlled and only accessible inside their firewall. Having a 
>>> non-export controlled binary for debugging would make our lives easier.
>>> 
>>> Best,
>>> 
>>> John
>>> --
>>> John Mellor-Crummey         Professor
>>> Dept of Computer Science    Rice University
>>> email: joh...@rice.edu      phone: 713-348-5179
>>> 
>>> 
>>> 
>>>> On May 9, 2025, at 12:15 PM, Daniel Kokron - NOAA Affiliate 
>>>> <daniel.kok...@noaa.gov> wrote:
>>>> 
>>>> The application is compiled with Intel ifort.  HPCToolkit and its 
>>>> dependencies are compiled with gcc-13.2.1.  I attached the spec for 
>>>> HPCToolkit.
>>>> 
>>>> 
>>>> (gdb) run --nocache 
>>>> /lfs/h1/hpc/support/daniel.kokron/Tickets/2025042910000034/sorc/ufs_model.fd/build_fv3_1/gfs_model.x
>>>> Starting program: 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/hpctoolkit-2024.01.1-a3im66mlumyu3hbzmeuor3kj3l553yau/bin/hpcstruct
>>>>  --nocache 
>>>> /lfs/h1/hpc/support/daniel.kokron/Tickets/2025042910000034/sorc/ufs_model.fd/build_fv3_1/gfs_model.x
>>>> Missing separate debuginfos, use: zypper install 
>>>> glibc-debuginfo-2.31-150300.63.1.x86_64
>>>> Missing separate debuginfo for 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libstdc++.so.6
>>>> Try: zypper install -C 
>>>> "debuginfo(build-id)=c74eca671e2dd0f063706372d103f8acef88f1e3"
>>>> Missing separate debuginfo for 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libgomp.so.1
>>>> Try: zypper install -C 
>>>> "debuginfo(build-id)=54684492738e640bcd600e830cee025dd8771a20"
>>>> Missing separate debuginfo for 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libgcc_s.so.1
>>>> Try: zypper install -C 
>>>> "debuginfo(build-id)=12f775ec4aeb94b749897b1b65638f18b61d1b1f"
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>>>  begin sequential analysis of CPU binary 
>>>> /lfs/h1/hpc/support/daniel.kokron/Tickets/2025042910000034/sorc/ufs_model.fd/build_fv3_1/gfs_model.x
>>>>  (size = 377978672, threads = 1)
>>>> hpcstruct: 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/boost-1.87.0-2cldxfpwec5rbbhxutja5lwcgzh6fbhc/include/boost/smart_ptr/shared_ptr.hpp:550:
>>>>  typename boost::detail::sp_member_access<T>::type 
>>>> boost::shared_ptr<T>::operator->() const [with T = 
>>>> Dyninst::SymtabAPI::typeSubrange; typename 
>>>> boost::detail::sp_member_access<T>::type = 
>>>> Dyninst::SymtabAPI::typeSubrange*]: Assertion `px != 0' failed.
>>>> 
>>>> Program received signal SIGABRT, Aborted.
>>>> 0x0000155553e2fd2b in raise () from /lib64/libc.so.6
>>>> (gdb) where
>>>> #0  0x0000155553e2fd2b in raise () from /lib64/libc.so.6
>>>> #1  0x0000155553e313e5 in abort () from /lib64/libc.so.6
>>>> #2  0x0000155553e27c6a in __assert_fail_base () from /lib64/libc.so.6
>>>> #3  0x0000155553e27cf2 in __assert_fail () from /lib64/libc.so.6
>>>> #4  0x0000155554d65127 in boost::enable_if<boost::integral_constant<bool, 
>>>> !((bool)boost::is_same<Dyninst::SymtabAPI::Type, 
>>>> Dyninst::SymtabAPI::typeSubrange>::value)>, 
>>>> boost::shared_ptr<Dyninst::SymtabAPI::Type> >::type 
>>>> Dyninst::SymtabAPI::typeCollection::addOrUpdateType<Dyninst::SymtabAPI::typeSubrange>(boost::shared_ptr<Dyninst::SymtabAPI::typeSubrange>)
>>>>  () from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #5  0x0000155554d547e6 in Dyninst::SymtabAPI::DwarfWalker::parseSubrange() 
>>>> () from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #6  0x0000155554d5a0a8 in 
>>>> Dyninst::SymtabAPI::DwarfWalker::parse_int(Dwarf_Die, bool, bool) () from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #7  0x0000155554d5b235 in 
>>>> Dyninst::SymtabAPI::DwarfWalker::findAnyType(Dwarf_Attribute, bool, 
>>>> boost::shared_ptr<Dyninst::SymtabAPI::Type>&) ()
>>>>    from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #8  0x0000155554d5b732 in 
>>>> Dyninst::SymtabAPI::DwarfWalker::findType(boost::shared_ptr<Dyninst::SymtabAPI::Type>&,
>>>>  bool) ()
>>>>    from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #9  0x0000155554d5497b in Dyninst::SymtabAPI::DwarfWalker::parseArray() () 
>>>> from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #10 0x0000155554d59fb8 in 
>>>> Dyninst::SymtabAPI::DwarfWalker::parse_int(Dwarf_Die, bool, bool) () from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #11 0x0000155554d5a53b in 
>>>> Dyninst::SymtabAPI::DwarfWalker::parse_int(Dwarf_Die, bool, bool) () from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #12 0x0000155554d5bab0 in 
>>>> Dyninst::SymtabAPI::DwarfWalker::parseModule(Dwarf_Die, 
>>>> Dyninst::SymtabAPI::Module*&) [clone .constprop.0] [clone .isra.0] ()
>>>>    from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #13 0x0000155554d5c15c in Dyninst::SymtabAPI::DwarfWalker::parse() [clone 
>>>> ._omp_fn.0] () from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #14 0x000015555403b306 in GOMP_parallel () from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/gcc-runtime-13.2.1-eo4evuugdi6s23do65dqomvbknlo4ong/lib/libgomp.so.1
>>>> #15 0x0000155554d5d2ed in Dyninst::SymtabAPI::DwarfWalker::parse() () from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #16 0x0000155554d0a0c1 in Dyninst::SymtabAPI::Object::parseTypeInfo() () 
>>>> from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #17 0x0000155554cd48a7 in Dyninst::SymtabAPI::Symtab::parseTypes() () from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #18 0x0000155553fef5d7 in __pthread_once_slow () from 
>>>> /lib64/libpthread.so.0
>>>> #19 0x0000155554ccc8b4 in Dyninst::SymtabAPI::Symtab::parseTypesNow() () 
>>>> from 
>>>> /lfs/h1/hpc/support/daniel.kokron/SPACK/spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/dyninst-13.0.0-74gjdp5yt432nk3wyv7dn7o45ovdw6hr/lib/libsymtabAPI.so.13.0
>>>> #20 0x00000000004418c8 in Inline::openSymtab 
>>>> (elfFile=elfFile@entry=0x8d94b0) at Struct-Inline.cpp:132
>>>> #21 0x000000000043cb31 in BAnal::Struct::makeStructure (filename=..., 
>>>> outFile=outFile@entry=0x830ab0, gapsFile=gapsFile@entry=0x0, 
>>>> gaps_filenm=..., search_path=..., structOpts=...) at Struct.cpp:770
>>>> #22 0x000000000042cb42 in doSingleBinary (args=..., 
>>>> sb=sb@entry=0x7ffffffd8740) at 
>>>> /usr/include/c++/13/bits/basic_string.tcc:238
>>>> #23 0x0000000000412cfd in realmain (argc=<optimized out>, argv=<optimized 
>>>> out>) at main.cpp:209
>>>> #24 0x000000000041220a in main (argc=<optimized out>, argv=<optimized 
>>>> out>) at main.cpp:137
>>>> 
>>>> On Fri, May 9, 2025 at 11:16 AM John Mellor-Crummey <joh...@rice.edu> 
>>>> wrote:
>>>> Hi Daniel,
>>>> 
>>>> You should be able to run hpcstruct under gdb and then run it directly on 
>>>> the offending binary as follows
>>>> 
>>>> gdb `which hpcstruct`
>>>> run --nocache /path/to/gfs_model
>>>> 
>>>> Then, you can send us a call path. By any chance is this a Fortran code 
>>>> compiled with gfortran? We are presently looking into a complaint about 
>>>> that from Sandia.
>>>> 
>>>> Best,
>>>> 
>>>> John
>>>> --
>>>> John Mellor-Crummey         Professor
>>>> Dept of Computer Science    Rice University
>>>> email: joh...@rice.edu      phone: 713-348-5179
>>>> 
>>>> 
>>>> 
>>>>> On May 9, 2025, at 8:59 AM, Daniel Kokron - NOAA Affiliate via 
>>>>> HPCToolkit-forum <hpctoolkit-fo...@mailman.rice.edu> wrote:
>>>>> 
>>>>> I am encountering the following error while running hpcstruct.  I cannot 
>>>>> find the core file in any of the usual places.  I have also tried running 
>>>>> hpcstruct under gdb without getting very far.
>>>>> 
>>>>> Wondering what my debugging options are?
>>>>> 
>>>>>  begin concurrent analysis of CPU binary gfs_model. (size = 377978416, 
>>>>> threads = 1)
>>>>> /bin/sh: line 32: 63480 Aborted                 (core dumped) 
>>>>> /spack/opt/spack/linux-sles15-zen2/gcc-13.2.1/hpctoolkit-2024.01.1-a3im66mlumyu3hbzmeuor3kj3l553yau/bin/hpcstruct.
>>>>>  --nocache -j 1 -o $struct_name -M $meas_dir 
>>>>> /Baseline_6Hr_WithWW3Restarts_Trace.16774.rawdata/cpubins/model.x > 
>>>>> $warn_name 2>&1
>>>>> 
>>>>> Dan
>>>>> _______________________________________________
>>>>> HPCToolkit-forum mailing list
>>>>> hpctoolkit-fo...@mailman.rice.edu
>>>>> https://mailman.rice.edu/mailman/listinfo/hpctoolkit-forum
>>>> 
>>> 
>> 
> 


_______________________________________________
Dyninst-api mailing list
Dyninst-api@cs.wisc.edu
https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api
