Perfect (except for some long lines that I reformatted). The commit is here: https://github.com/SchedMD/slurm/commit/f8490c4389e82d6b348582d726683b04d457330f
Thanks! Quoting Daniel Ahlin <[email protected]>:
Hi again, [dah@petrel src]$ diff -u slurm-14.11.8/src/slurmd/slurmstepd/req.c{~,} --- slurm-14.11.8/src/slurmd/slurmstepd/req.c~ 2015-07-08 00:19:49.000000000 +0200 +++ slurm-14.11.8/src/slurmd/slurmstepd/req.c 2015-08-21 09:58:53.853187174 +0200 @@ -409,7 +409,7 @@ free_buf(buffer); goto fail; } - rc = g_slurm_auth_verify(auth_cred, NULL, 2, NULL); + rc = g_slurm_auth_verify(auth_cred, NULL, 2, slurm_get_auth_info()); if (rc != SLURM_SUCCESS) { error("Verifying authentication credential: %s", g_slurm_auth_errstr(g_slurm_auth_errno(auth_cred))); is needed as well, this has now been somewhat tested. However inspecting function calls I would assume the following is a more complete patch, but I've not tested this: --- slurm-14.11.8/src/slurmd/slurmstepd/req.c~ 2015-07-08 00:19:49.000000000 +0200 +++ slurm-14.11.8/src/slurmd/slurmstepd/req.c 2015-08-21 10:07:25.766802443 +0200 @@ -409,7 +409,7 @@ free_buf(buffer); goto fail; } - rc = g_slurm_auth_verify(auth_cred, NULL, 2, NULL); + rc = g_slurm_auth_verify(auth_cred, NULL, 2, slurm_get_auth_info()); if (rc != SLURM_SUCCESS) { error("Verifying authentication credential: %s", g_slurm_auth_errstr(g_slurm_auth_errno(auth_cred))); @@ -419,8 +419,8 @@ } /* Get the uid & gid from the credential, then destroy it. */ - uid = g_slurm_auth_get_uid(auth_cred, NULL); - gid = g_slurm_auth_get_gid(auth_cred, NULL); + uid = g_slurm_auth_get_uid(auth_cred, slurm_get_auth_info()); + gid = g_slurm_auth_get_gid(auth_cred, slurm_get_auth_info()); debug3(" Identity: uid=%d, gid=%d", uid, gid); g_slurm_auth_destroy(auth_cred); free_buf(buffer); Then continuing on - src/common/slurm_auth.h shows that the following functions are intended to take the auth_info argument: extern void * g_slurm_auth_create( void *hosts, int timeout, char *auth_info ); extern int g_slurm_auth_verify( void *cred, void *hosts, int timeout, char *auth_info ); extern uid_t g_slurm_auth_get_uid( void *cred, char *auth_info ); extern gid_t g_slurm_auth_get_gid( void *cred, char *auth_info ); g_slurm_auth_create and g_slurm_auth_verify seems to be OK now - but g_slurm_auth_get_uid and g_slurm_auth_get_gid are not (most cases will work anyway since the munge auth plugin will only use auth_info if the cred has not yet been verified - and in many instances it has - still I would assume passing the information to be the safe thing to do). I've attached two blind patches that just replaces null with slurm_get_auth_info() for these two functions. However, I have not tested these two patches and probably each replacement should be checked by someone who knows more of the surrounding context. Best regards, Daniel On Fri, Aug 14, 2015 at 12:57 AM, Moe Jette <[email protected]> wrote:Hi Daniel, You seem to have found two places where the AuthInfo configuration parameter was not used. I've committed your patch here: https://github.com/SchedMD/slurm/commit/02c96859e0d1df5f98b9422a64059820cc7035c7 Thanks! Quoting Daniel Ahlin <[email protected]>:Hi, We are using a non-default AuthInfo configuration and based on log-messages we see I believe this is not properly handled in certain parts of the code. Typical log message: Aug 12 17:06:15 t02n20 slurmd[27001]: error: Munge encode failed: Failed to access "/var/run/munge/munge.socket.2": No such file or directory Aug 12 17:06:15 t02n20 slurmd[27001]: error: Creating authentication credential: Socket communication error Aug 12 17:06:15 t02n20 slurmd[27001]: error: stepd_connect to 3165.0 failed: Protocol authentication error Aug 12 17:06:15 t02n20 slurmd[27001]: error: If munged is up, restart with --num-threads=10 Below is two untested fixes for this. It may be some time before we can deploy this so I post them anyway for comments and possible use to other sites. diff -u slurm-14.11.8/src/common/stepd_api.c~ slurm-14.11.8/src/common/stepd_api.c --- slurm-14.11.8/src/common/stepd_api.c~ 2015-07-08 00:19:49.000000000 +0200 +++ slurm-14.11.8/src/common/stepd_api.c 2015-08-13 07:31:32.330700484 +0200 @@ -238,7 +238,7 @@ buffer = init_buf(0); /* Create an auth credential */ - auth_cred = g_slurm_auth_create(NULL, 2, NULL); + auth_cred = g_slurm_auth_create(NULL, 2, slurm_get_auth_info()); if (auth_cred == NULL) { error("Creating authentication credential: %s", g_slurm_auth_errstr(g_slurm_auth_errno(NULL))); I believe the same error to be present in: diff -u slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c~ slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c--- slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c~ 2015-07-08 00:19:49.000000000 +0200 +++ slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c 2015-08-13 07:34:41.204029110 +0200 @@ -154,7 +154,7 @@ spawn_subcmd_t *subcmd; void *auth_cred; - auth_cred = g_slurm_auth_create(NULL, 2, NULL); + auth_cred = g_slurm_auth_create(NULL, 2, slurm_get_auth_info()); if (auth_cred == NULL) { error("authentication: %s", g_slurm_auth_errstr(g_slurm_auth_errno(NULL)) ); Best regards, Daniel Ahlin PDC, KTH-- Morris "Moe" Jette CTO, SchedMD LLC Commercial Slurm Development and Support =============================================================== Slurm User Group Meeting, 15-16 September 2015, Washington D.C. http://slurm.schedmd.com/slurm_ug_agenda.html
-- Morris "Moe" Jette CTO, SchedMD LLC Commercial Slurm Development and Support =============================================================== Slurm User Group Meeting, 15-16 September 2015, Washington D.C. http://slurm.schedmd.com/slurm_ug_agenda.html
