Hi,

I have run into a problem using the slurmdb API that I'd like to bring to your attention. It came up while porting our pipeline manager, which uses DRMAA, to slurm and trying to enhance the slurm-drmaa library (http://apps.man.poznan.pl/trac/slurm-drmaa) so that the drmaa_job_ps() call correctly returns the status of completed jobs as well as of jobs that are running or queued.

I'm currently using slurm 14.11.6, which I know is old, but it is the version that is installed and running on our cluster (and I don't control which version is in use).
I do not know if this problem also exists in newer versions of slurm.

In a nutshell, I believe the problem is that common/slurm_accounting_storage.c, which contains slurm_acct_storage_init, appears to be compiled into _both_ libslurm.so and libslurmdb.so, so there are two copies of its static data, one in each shared library. However, db_api/connection_functions.c, which contains slurmdb_connection_get, is compiled only into libslurmdb.so.

> nm /usr/cluster/lib/libslurm.so.28.0.0 | grep slurm_acct_storage_init
00000000000a690c T slurm_acct_storage_init
> nm /usr/cluster/lib/libslurmdb.so.28.0.0 | grep slurm_acct_storage_init
00000000000ab240 T slurm_acct_storage_init
> nm /usr/cluster/lib/libslurm.so.28.0.0 | grep slurmdb_connection_get
> nm /usr/cluster/lib/libslurmdb.so.28.0.0 | grep slurmdb_connection_get
000000000001d000 T slurmdb_connection_get

When I try to write code like the following:

    slurm_acct_storage_init(NULL);
    conn = slurmdb_connection_get();

and link with -lslurm -lslurmdb, it core dumps, because the call to slurm_acct_storage_init resolves to the copy in libslurm.so and initializes the static data there, but not the static data in libslurmdb.so, which is what slurmdb_connection_get uses. I have been able to work around this by dynamically calling the second copy of slurm_acct_storage_init (through dlopen/dlsym), but it seems like a bug that the static data (and the code) are built into both of these shared libraries.
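
A minimal sketch of that dlopen/dlsym workaround follows, for illustration only. The helper names (lookup_in_library, init_slurmdb_copy) are hypothetical, and the signature int slurm_acct_storage_init(char *) is an assumption inferred from the call above; check it against the headers of the installed slurm version before relying on it.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Assumed signature of slurm_acct_storage_init (inferred from the call
 * slurm_acct_storage_init(NULL); verify against your slurm headers). */
typedef int (*acct_storage_init_fn)(char *loc);

/* Hypothetical helper: look up `symbol` in the shared library at `lib_path`.
 * Returns the symbol address, or NULL if the library or symbol is missing.
 * On success the library handle is deliberately left open so that the
 * returned address stays valid for the life of the process. */
static void *lookup_in_library(const char *lib_path, const char *symbol)
{
    assert(lib_path != NULL && symbol != NULL);

    void *handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlopen(%s) failed: %s\n", lib_path,
                err ? err : "unknown error");
        return NULL;
    }

    dlerror(); /* clear any stale error state before dlsym */
    void *addr = dlsym(handle, symbol);
    if (addr == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlsym(%s) failed: %s\n", symbol,
                err ? err : "symbol resolved to NULL");
        dlclose(handle);
        return NULL;
    }
    return addr;
}

/* Hypothetical helper: call the copy of slurm_acct_storage_init that lives
 * inside the library at `slurmdb_path`, so that library's own static data
 * is initialized. Returns -1 if the symbol cannot be found, otherwise
 * whatever the init function returns. */
static int init_slurmdb_copy(const char *slurmdb_path)
{
    void *addr = lookup_in_library(slurmdb_path, "slurm_acct_storage_init");
    if (addr == NULL)
        return -1;

    /* Convert the object pointer from dlsym to a function pointer in the
     * way the POSIX dlsym documentation suggests. */
    acct_storage_init_fn init_fn;
    *(void **)(&init_fn) = addr;
    return init_fn(NULL);
}
```

With the library from the nm listing above, this would be called as init_slurmdb_copy("/usr/cluster/lib/libslurmdb.so.28.0.0") before slurmdb_connection_get(). On older glibc versions the program also needs -ldl on the link line.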

Do you know if this has been fixed in more recent versions of slurm?
Or am I trying to use the API in a way that is not intended?
Should I file a bug report?

Thanks very much,
Bob Handsaker
