Hi Daniel,

You seem to have found two places where the AuthInfo configuration parameter was not used. I've committed your patch here:
https://github.com/SchedMD/slurm/commit/02c96859e0d1df5f98b9422a64059820cc7035c7

Thanks!

Quoting Daniel Ahlin <[email protected]>:
Hi,

We are using a non-default AuthInfo configuration and, based on the
log messages we see, I believe it is not handled properly in certain
parts of the code.

Typical log message:
Aug 12 17:06:15 t02n20 slurmd[27001]: error: Munge encode failed: Failed to access "/var/run/munge/munge.socket.2": No such file or directory
Aug 12 17:06:15 t02n20 slurmd[27001]: error: Creating authentication credential: Socket communication error
Aug 12 17:06:15 t02n20 slurmd[27001]: error: stepd_connect to 3165.0 failed: Protocol authentication error
Aug 12 17:06:15 t02n20 slurmd[27001]: error: If munged is up, restart with --num-threads=10

Below are two untested fixes for this. It may be some time before we
can deploy them, so I am posting them anyway for comments and possible
use by other sites.

diff -u slurm-14.11.8/src/common/stepd_api.c~ slurm-14.11.8/src/common/stepd_api.c
--- slurm-14.11.8/src/common/stepd_api.c~       2015-07-08 00:19:49.000000000 +0200
+++ slurm-14.11.8/src/common/stepd_api.c        2015-08-13 07:31:32.330700484 +0200
@@ -238,7 +238,7 @@

        buffer = init_buf(0);
        /* Create an auth credential */
-       auth_cred = g_slurm_auth_create(NULL, 2, NULL);
+       auth_cred = g_slurm_auth_create(NULL, 2, slurm_get_auth_info());
        if (auth_cred == NULL) {
                error("Creating authentication credential: %s",
                      g_slurm_auth_errstr(g_slurm_auth_errno(NULL)));

I believe the same error is present in:

diff -u slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c~ slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c
--- slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c~ 2015-07-08 00:19:49.000000000 +0200
+++ slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c  2015-08-13 07:34:41.204029110 +0200
@@ -154,7 +154,7 @@
        spawn_subcmd_t *subcmd;
        void *auth_cred;

-       auth_cred = g_slurm_auth_create(NULL, 2, NULL);
+       auth_cred = g_slurm_auth_create(NULL, 2, slurm_get_auth_info());
        if (auth_cred == NULL) {
                error("authentication: %s",
                      g_slurm_auth_errstr(g_slurm_auth_errno(NULL)) );
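
Both hunks replace a NULL third argument with slurm_get_auth_info().
As a rough illustration of why that matters, here is a minimal
standalone sketch (not Slurm code) of how an auth layer might choose a
socket path from an AuthInfo-style string. The helper name
pick_socket(), the "socket=<path>" syntax, and the default path (taken
from the log above) are all assumptions made for this example; the
real plugin may parse the string differently.

```c
#include <assert.h>
#include <string.h>

/* Assumed default; this is the path that appears in the log above. */
#define DEFAULT_MUNGE_SOCKET "/var/run/munge/munge.socket.2"

/*
 * Hypothetical helper, not part of Slurm: return the socket path to
 * use for the given AuthInfo-style string.  A NULL or empty string,
 * or one without a "socket=" key, yields the default path.  A custom
 * path is copied into buf (truncated if it does not fit).
 */
static const char *pick_socket(const char *auth_info, char *buf, size_t len)
{
	const char *path, *end;
	size_t n;

	assert(buf != NULL && len > 0);

	/* This is the case hit when the caller passes NULL. */
	if (auth_info == NULL || auth_info[0] == '\0')
		return DEFAULT_MUNGE_SOCKET;

	path = strstr(auth_info, "socket=");
	if (path == NULL)
		return DEFAULT_MUNGE_SOCKET;
	path += strlen("socket=");

	/* Assume further options, if any, are comma separated. */
	end = strchr(path, ',');
	n = end ? (size_t)(end - path) : strlen(path);
	if (n >= len)
		n = len - 1;
	memcpy(buf, path, n);
	buf[n] = '\0';
	return buf;
}
```

Under these assumptions, pick_socket(NULL, buf, sizeof(buf)) gives the
default path even when the site has configured a different socket,
which would be consistent with the "Failed to access" message in the
log.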

Best regards,
Daniel Ahlin
PDC, KTH


--
Morris "Moe" Jette
CTO, SchedMD LLC
Commercial Slurm Development and Support
===============================================================
Slurm User Group Meeting, 15-16 September 2015, Washington D.C.
http://slurm.schedmd.com/slurm_ug_agenda.html
