Hi Ole, this looks similar to what I've been doing for building RPMs. (It's documented for our in-house branch at [1], if anyone wants to compare.) Happy to see I'm not doing something totally stupid. :)
[1] https://github.com/UCL-ARC/nhc/blob/ucl/README.md

Thanks,
Frank

--
Dr. Frank Otto
Principal Research Infrastructure Developer
Advanced Research Computing Centre
University College London, UK

________________________________
From: Ole Holm Nielsen via slurm-users <slurm-users@lists.schedmd.com>
Sent: 22 August 2025 12:17
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Re: [EXTERNAL] Node Health Check Program

On 8/19/25 21:25, Jennings, Michael E via slurm-users wrote:
> Have you by chance given the `dev` branch a try? All our production servers
> currently run `lbnl-nhc-1.5-0.82.gf8dc.el8.noarch` built from the `dev`
> branch, have been for some time now, and it's been rock solid. Our
> RHEL-based clusters also use this version. Our HPE/Cray Shasta clusters,
> including our largest (classified) clusters Crossroads, Tycho, and Venado,
> use a variant. (Long story short, I've merged in all my changes into a
> separate branch, but the reverse is not yet true.) This variant is, at
> present, COS/SLES-specific, but it has quite a few useful additional checks
> (many of them Cray-centric) contributed by other LANL folks that I haven't
> had a chance to upstream yet.

On Michael's recommendation I wanted to try out the 'dev' branch (version 1.5) of NHC and build the RPM package he referred to. Since I'm not a software developer, I had to figure out the detailed build steps for myself - perhaps trivial to some of you, and stumbling blocks to others. This is what I came up with:

  $ git clone https://github.com/mej/nhc.git
  $ cd nhc
  $ git switch dev                  # Switch to the 'dev' branch
  $ git status                      # Check the status
  $ grep nhc_version configure.ac   # Verify the 'dev' version
  m4_define([nhc_version], [1.5])
  $ ./autogen.sh                    # Undocumented build requirement
  $ cd ..
  $ mv nhc lbnl-nhc-1.5             # Rename the source folder
  $ tar czf lbnl-nhc-1.5.tar.gz lbnl-nhc-1.5
  $ rpmbuild -ta lbnl-nhc-1.5.tar.gz

The resulting RPM package is:

  ~/rpmbuild/RPMS/noarch/lbnl-nhc-1.5-0.82.gf8dc.el8.noarch.rpm

I've added those steps to my Slurm Wiki page:
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_configuration/#node-health-check

Any comments?

Thanks,
Ole

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com