Nancy,
Here’s the information you requested:
slurm.conf:
    AcctGatherProfileType=acct_gather_profile/hdf5
    AcctGatherFilesystemType=acct_gather_filesystem/lustre
acct_gather.conf:
    ProfileHDF5Dir=/curc/slurm/slurm/acct
    ProfileHDF5Default=Filesystem
Dan Milroy
From: Nancy Kritkausky
Sent: Wednesday, May 28, 2014 11:32 AM
To: slurm-dev
Subject: [slurm-dev] Re: HDF5 Profile Plugin setup
Hello Daniel,
The syntax that is reported in the error message actually looks okay. Can you
provide the part of your slurm.conf that defines the AcctGatherProfileType, and
your acct_gather.conf? Maybe we can try to re-create the problem.
Thanks,
Nancy
From: Daniel Milroy
Sent: Wednesday, May 28, 2014 10:08
To: slurm-dev
Subject: [slurm-dev] Re: HDF5 Profile Plugin setup
Hi Nancy and Rod,
I believe that slurm was built properly on the runtime system. Slurm was
configured with the --with-hdf5=yes option, and config.log indicates that the
hdf5 libs were found:
configure:20180: checking hdf5.h usability
configure:20180: gcc -c -g -O2 -pthread -fno-gcse -I/include conftest.c >&5
configure:20180: $? = 0
configure:20180: result: yes
configure:20180: checking hdf5.h presence
configure:20180: gcc -E -I/include conftest.c
configure:20180: $? = 0
configure:20180: result: yes
configure:20180: checking for hdf5.h
configure:20180: result: yes
configure:20188: checking for H5Fcreate in -lhdf5
configure:20213: gcc -o conftest -g -O2 -pthread -fno-gcse -I/include -L/usr/lib64 conftest.c -lhdf5 -lm -lz -lhdf5 >&5
configure:20213: $? = 0
configure:20222: result: yes
configure:20234: checking for main in -lhdf5_hl
configure:20253: gcc -o conftest -g -O2 -pthread -fno-gcse -I/include -L/usr/lib64 conftest.c -lhdf5_hl -lm -lz -lhdf5 >&5
configure:20253: $? = 0
configure:20262: result: yes
configure:20274: checking for matching HDF5 Fortran wrapper
configure:20278: result: /usr/bin/h5fc
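The same information can be pulled out of config.log without reading it by eye. What follows is only a minimal sketch in Python, assuming the log uses the "configure:NNNN: checking ..." and "configure:NNNN: result: ..." line shapes shown in the excerpt above; the function name configure_results is hypothetical and not part of Slurm or autoconf.

```python
import re

# Line shapes taken from the config.log excerpt in this thread.
CHECK = re.compile(r"^configure:\d+: checking (.+)$")
RESULT = re.compile(r"^configure:\d+: result: (.+)$")


def configure_results(log_lines):
    """Pair each 'checking X' line with the next 'result: Y' line.

    Returns a dict mapping the check description to its result.
    Lines that match neither pattern (compiler commands, exit codes,
    wrapped continuation lines) are ignored.
    """
    results = {}
    pending = None
    for line in log_lines:
        line = line.rstrip("\n")
        check = CHECK.match(line)
        if check:
            pending = check.group(1)
            continue
        result = RESULT.match(line)
        if result and pending is not None:
            results[pending] = result.group(1)
            pending = None
    return results


if __name__ == "__main__":
    # Assumes config.log sits in the current directory (the build tree).
    with open("config.log") as log:
        for check, result in configure_results(log).items():
            if "hdf5" in check.lower():
                print(f"{check}: {result}")
```

A "yes" for the H5Fcreate check only shows that the build host could link against hdf5 at configure time; it says nothing about whether the runtime hosts can load the resulting plugin.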
The required shared object is in
/curc/slurm/slurm/14.03.3/lib/slurm/acct_gather_profile_hdf5.so.
Thank you,
Dan Milroy
From: Nancy Kritkausky
Sent: Wednesday, May 28, 2014 10:55 AM
To: slurm-dev
Subject: [slurm-dev] Re: HDF5 Profile Plugin setup
Dan,
You can check your installation to make sure the library is there. The name of
the library is acct_gather_profile_hdf5.so. It is normally installed under
/usr/lib64/slurm, but depending on your configure options it could be elsewhere,
including /usr/share. As Rod said, if hdf5 is not installed, it will not be
built.
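That check can be sketched in Python. The plugin type from slurm.conf (acct_gather_profile/hdf5) corresponds to the file name acct_gather_profile_hdf5.so, as the names in this thread show. The helper names plugin_filename and find_plugin are hypothetical, and the directories listed are only the locations mentioned in this thread, not Slurm's actual search path.

```python
from pathlib import Path


def plugin_filename(plugin_type):
    """Map a plugin type such as 'acct_gather_profile/hdf5' to its .so name."""
    return plugin_type.replace("/", "_") + ".so"


def find_plugin(plugin_type, plugin_dirs):
    """Return the paths under plugin_dirs where the plugin file exists."""
    name = plugin_filename(plugin_type)
    candidates = (Path(d) / name for d in plugin_dirs)
    return [path for path in candidates if path.is_file()]


if __name__ == "__main__":
    # Assumption: these are just the directories named in this thread.
    dirs = ["/usr/lib64/slurm", "/curc/slurm/slurm/14.03.3/lib/slurm"]
    hits = find_plugin("acct_gather_profile/hdf5", dirs)
    if hits:
        for path in hits:
            print(f"found {path}")
    else:
        print("acct_gather_profile_hdf5.so not found in the listed directories")
```

An empty result on a host suggests the plugin was not built or not installed there, which would be worth checking on both the build host and the runtime hosts.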
Hope this helps too,
Nancy
From: Rod Schultz
Sent: Wednesday, May 28, 2014 09:23
To: slurm-dev
Subject: [slurm-dev] Re: HDF5 Profile Plugin setup
Dan,
Do you have HDF5 installed on your system? It needs to be on both the runtime
system and the system on which you built slurm.
At configure time, there is a dependency on hdf5 being installed.
The first couple of errors appear to be caused by not finding the library. This
is probably the result of a build problem.
The last few come from the continued parsing of acct_gather.conf.
The parsing of this file involves calling the parser in each acct_gather
sub-plugin. If a plugin isn’t installed, its items in the file are considered
errors.
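The effect described above can be illustrated with a toy model in Python. This is not Slurm's parser: the function parse_acct_gather_conf and the loaded_plugin_keys mapping are hypothetical, and the sketch only assumes that each loaded plugin contributes the keys it understands and that any other key is reported as unrecognized.

```python
def parse_acct_gather_conf(lines, loaded_plugin_keys):
    """Return (line_number, key) for every key no loaded plugin recognizes.

    loaded_plugin_keys maps a plugin name to the keys it handles.
    A plugin that failed to load is simply absent from the mapping,
    so its keys end up reported as errors.
    """
    known = set()
    for keys in loaded_plugin_keys.values():
        known.update(keys)

    errors = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key = line.split("=", 1)[0].strip()
        if key not in known:
            errors.append((lineno, key))
    return errors


if __name__ == "__main__":
    conf = [
        "ProfileHDF5Dir=/curc/slurm/slurm/acct",
        "ProfileHDF5Default=Filesystem",
    ]
    # Toy assumption: the hdf5 profile plugin owns both keys.
    hdf5_keys = {"acct_gather_profile/hdf5": {"ProfileHDF5Dir", "ProfileHDF5Default"}}
    print("plugin loaded:    ", parse_acct_gather_conf(conf, hdf5_keys))
    print("plugin not loaded:", parse_acct_gather_conf(conf, {}))
```

In this model the configuration file is identical in both runs; only the set of loaded plugins changes, which is why valid-looking lines can still be reported as parse errors.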
Rod
From: Daniel Milroy
Sent: Wednesday, May 28, 2014 8:37 AM
To: slurm-dev
Subject: [slurm-dev] Re: HDF5 Profile Plugin setup
Hi Danny,
There wasn’t anything in the “Profiling Using HDF5 User Guide” that indicated
that I should load the plugin via spank. It was a result of research into
enabling the plugin since various combinations of the parameters weren’t
working.
Removing the reference to the lustre acct_gather shared object in
plugstack.conf and restarting the service yields:
error: Couldn't find the specified plugin name for acct_gather_profile/hdf5 looking at all files
error: cannot find acct_gather_profile plugin for acct_gather_profile/hdf5
fatal: ProfileHDF5Default can not be set to NotSet, please specify a valid option
error: Parsing error at unrecognized key: ProfileHDF5Dir
error: Parse error in file /curc/slurm/slurm/etc/acct_gather.conf line 1: "ProfileHDF5Dir=/curc/slurm/slurm/acct"
error: Parsing error at unrecognized key: ProfileHDF5Default
error: Parse error in file /curc/slurm/slurm/etc/acct_gather.conf line 2: "ProfileHDF5Default=Filesystem"
Regards,
Dan Milroy
From: Danny Auble
Sent: Tuesday, May 27, 2014 12:29 PM
To: slurm-dev
Subject: [slurm-dev] Re: HDF5 Profile Plugin setup
Dan, I wouldn't expect spank would be needed to load this plugin.
Try taking the line out of your plugstack.conf and see if that works for you.
Was there something in the documentation
(http://slurm.schedmd.com/hdf5_profile_user_guide.html) that led you down this
path?
Danny
On 05/23/2014 03:59 PM, Daniel Milroy wrote:
Hello,
I’ve been experiencing difficulties enabling the AcctGatherProfileType/hdf5
plugin for the Lustre filesystem. So far I’ve set the following parameters:
slurm.conf:
    AcctGatherProfileType=acct_gather_profile/hdf5
    AcctGatherFilesystemType=acct_gather_filesystem/lustre
acct_gather.conf:
    ProfileHDF5Dir=/curc/slurm/slurm/acct
    ProfileHDF5Default=Filesystem
plugstack.conf:
    required /curc/slurm/slurm/current/lib/slurm/acct_gather_filesystem_lustre.so
Upon job submission, I receive the following error:
salloc: error: spank: "/curc/slurm/slurm/14.03.3/lib/slurm/acct_gather_filesystem_lustre.so" exports 0 symbols
salloc: error: spank: /curc/slurm/slurm/etc/plugstack.conf:2: Failed to load plugin /curc/slurm/slurm/14.03.3/lib/slurm/acct_gather_filesystem_lustre.so. Aborting.
salloc: error: Failed to initialize plugin stack
Please let me know what I can do to properly enable this plugin.
Regards,
Dan Milroy