Hi,

I've attached an as of yet non-tested patch against 15.08.4 for the
same issues.

Best regards,
Daniel


On Fri, Aug 21, 2015 at 8:50 PM, Moe Jette <[email protected]> wrote:
>
> Perfect (except for some long lines that I reformatted). The commit is here:
> https://github.com/SchedMD/slurm/commit/f8490c4389e82d6b348582d726683b04d457330f
>
>
> Thanks!
>
> Quoting Daniel Ahlin <[email protected]>:
>
>> Hi again,
>>
>> [dah@petrel src]$ diff -u slurm-14.11.8/src/slurmd/slurmstepd/req.c{~,}
>> --- slurm-14.11.8/src/slurmd/slurmstepd/req.c~  2015-07-08
>> 00:19:49.000000000 +0200
>> +++ slurm-14.11.8/src/slurmd/slurmstepd/req.c   2015-08-21
>> 09:58:53.853187174 +0200
>> @@ -409,7 +409,7 @@
>>                 free_buf(buffer);
>>                 goto fail;
>>         }
>> -       rc = g_slurm_auth_verify(auth_cred, NULL, 2, NULL);
>> +       rc = g_slurm_auth_verify(auth_cred, NULL, 2,
>> slurm_get_auth_info());
>>         if (rc != SLURM_SUCCESS) {
>>                 error("Verifying authentication credential: %s",
>>                       g_slurm_auth_errstr(g_slurm_auth_errno(auth_cred)));
>>
>> is needed as well, this has now been somewhat tested.
>>
>> However inspecting function calls I would assume the following is a
>> more complete patch, but I've not tested this:
>>
>> --- slurm-14.11.8/src/slurmd/slurmstepd/req.c~  2015-07-08
>> 00:19:49.000000000 +0200
>> +++ slurm-14.11.8/src/slurmd/slurmstepd/req.c   2015-08-21
>> 10:07:25.766802443 +0200
>> @@ -409,7 +409,7 @@
>>                 free_buf(buffer);
>>                 goto fail;
>>         }
>> -       rc = g_slurm_auth_verify(auth_cred, NULL, 2, NULL);
>> +       rc = g_slurm_auth_verify(auth_cred, NULL, 2,
>> slurm_get_auth_info());
>>         if (rc != SLURM_SUCCESS) {
>>                 error("Verifying authentication credential: %s",
>>                       g_slurm_auth_errstr(g_slurm_auth_errno(auth_cred)));
>> @@ -419,8 +419,8 @@
>>         }
>>
>>         /* Get the uid & gid from the credential, then destroy it. */
>> -       uid = g_slurm_auth_get_uid(auth_cred, NULL);
>> -       gid = g_slurm_auth_get_gid(auth_cred, NULL);
>> +       uid = g_slurm_auth_get_uid(auth_cred, slurm_get_auth_info());
>> +       gid = g_slurm_auth_get_gid(auth_cred, slurm_get_auth_info());
>>         debug3("  Identity: uid=%d, gid=%d", uid, gid);
>>         g_slurm_auth_destroy(auth_cred);
>>         free_buf(buffer);
>>
>> Then continuing on - src/common/slurm_auth.h shows that the following
>> functions are intended to take the auth_info argument:
>> extern void *   g_slurm_auth_create( void *hosts, int timeout, char
>> *auth_info );
>>  extern int      g_slurm_auth_verify( void *cred, void *hosts, int
>> timeout, char *auth_info );
>> extern uid_t    g_slurm_auth_get_uid( void *cred, char *auth_info );
>> extern gid_t    g_slurm_auth_get_gid( void *cred, char *auth_info );
>>
>> g_slurm_auth_create and g_slurm_auth_verify seems to be OK now - but
>> g_slurm_auth_get_uid and g_slurm_auth_get_gid are not (most cases will
>> work anyway since the munge auth plugin will only use auth_info if the
>> cred has not yet been verified - and in many instances it has - still
>> I would assume passing the information to be the safe thing to do).
>> I've attached two blind patches that just replaces null with
>> slurm_get_auth_info() for these two functions.
>>
>> However, I have not tested these two patches and probably each
>> replacement should be checked by someone who knows more of the
>> surrounding context.
>>
>> Best regards,
>> Daniel
>>
>>
>> On Fri, Aug 14, 2015 at 12:57 AM, Moe Jette <[email protected]> wrote:
>>>
>>>
>>> Hi Daniel,
>>>
>>> You seem to have found two places where the AuthInfo configuration
>>> parameter
>>> was not used. I've committed your patch here:
>>>
>>> https://github.com/SchedMD/slurm/commit/02c96859e0d1df5f98b9422a64059820cc7035c7
>>>
>>> Thanks!
>>>
>>>
>>> Quoting Daniel Ahlin <[email protected]>:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> We are using a non-default AuthInfo configuration and based on
>>>> log-messages we see I believe this is not properly handled in certain
>>>> parts of the code.
>>>>
>>>> Typical log message:
>>>> Aug 12 17:06:15 t02n20 slurmd[27001]: error: Munge encode failed:
>>>> Failed to access "/var/run/munge/munge.socket.2": No such file or
>>>> directory
>>>> Aug 12 17:06:15 t02n20 slurmd[27001]: error: Creating authentication
>>>> credential: Socket communication error
>>>> Aug 12 17:06:15 t02n20 slurmd[27001]: error: stepd_connect to 3165.0
>>>> failed: Protocol authentication error
>>>> Aug 12 17:06:15 t02n20 slurmd[27001]: error: If munged is up, restart
>>>> with --num-threads=10
>>>>
>>>> Below is two untested fixes for this. It may be some time before we
>>>> can deploy this so I post them anyway for comments and possible use to
>>>> other sites.
>>>>
>>>> diff -u slurm-14.11.8/src/common/stepd_api.c~
>>>> slurm-14.11.8/src/common/stepd_api.c
>>>> --- slurm-14.11.8/src/common/stepd_api.c~       2015-07-08
>>>> 00:19:49.000000000 +0200
>>>> +++ slurm-14.11.8/src/common/stepd_api.c        2015-08-13
>>>> 07:31:32.330700484 +0200
>>>> @@ -238,7 +238,7 @@
>>>>
>>>>         buffer = init_buf(0);
>>>>         /* Create an auth credential */
>>>> -       auth_cred = g_slurm_auth_create(NULL, 2, NULL);
>>>> +       auth_cred = g_slurm_auth_create(NULL, 2, slurm_get_auth_info());
>>>>         if (auth_cred == NULL) {
>>>>                 error("Creating authentication credential: %s",
>>>>                       g_slurm_auth_errstr(g_slurm_auth_errno(NULL)));
>>>>
>>>> I believe the same error to be present in:
>>>>
>>>> diff -u slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c~
>>>> slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c---
>>>> slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c~ 2015-07-08
>>>> 00:19:49.000000000 +0200
>>>> +++ slurm-14.11.8/src/plugins/mpi/pmi2/spawn.c  2015-08-13
>>>> 07:34:41.204029110 +0200
>>>> @@ -154,7 +154,7 @@
>>>>         spawn_subcmd_t *subcmd;
>>>>         void *auth_cred;
>>>>
>>>> -       auth_cred = g_slurm_auth_create(NULL, 2, NULL);
>>>> +       auth_cred = g_slurm_auth_create(NULL, 2, slurm_get_auth_info());
>>>>         if (auth_cred == NULL) {
>>>>                 error("authentication: %s",
>>>>                       g_slurm_auth_errstr(g_slurm_auth_errno(NULL)) );
>>>>
>>>> Best regards,
>>>> Daniel Ahlin
>>>> PDC, KTH
>>>
>>>
>>>
>>>
>>> --
>>> Morris "Moe" Jette
>>> CTO, SchedMD LLC
>>> Commercial Slurm Development and Support
>>> ===============================================================
>>> Slurm User Group Meeting, 15-16 September 2015, Washington D.C.
>>> http://slurm.schedmd.com/slurm_ug_agenda.html
>
>
>
> --
> Morris "Moe" Jette
> CTO, SchedMD LLC
> Commercial Slurm Development and Support
> ===============================================================
> Slurm User Group Meeting, 15-16 September 2015, Washington D.C.
> http://slurm.schedmd.com/slurm_ug_agenda.html

Attachment: g_slurm_auth_x-should-use-slurm_get_auth_info.patch.gz
Description: GNU Zip compressed data

Reply via email to