Hi,

1: We currently don't renew tickets for running jobs. Instead, we allow
longer-lived initial tickets, so the forwarded tickets generally last
long enough. However, we will very probably implement ticket renewal
for jobs waiting in the queue. Generalized ticket renewal for running
jobs is more complicated, in my view. We may end up implementing
ticket/token renewal only for PAGs that have been spawned by the SPANK
module, since those are the ones we would be able to keep track of.
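
To make the queued-job case concrete, here is a rough sketch of the
decision a queue-side renewer would have to make for each stored ticket.
It is not our implementation (none exists yet); the function name, the
parameter names and the one-hour margin are assumptions made up for
illustration. It only encodes the general Kerberos rules that an expired
ticket cannot be renewed and that a renewed ticket cannot outlive its
renew-till time.

```python
from datetime import datetime, timedelta, timezone


def should_renew(now, end_time, renew_till, margin=timedelta(hours=1)):
    """Decide whether a renewable ticket is worth renewing right now.

    All names and the default margin are hypothetical, for illustration only.
    """
    if now >= end_time:
        # Already expired: the KDC would reject a renewal request.
        return False
    if renew_till <= end_time:
        # Renewal cannot push the end time past renew_till, so nothing to gain.
        return False
    # Renew only when the remaining lifetime has dropped to the margin or less.
    return end_time - now <= margin


if __name__ == "__main__":
    now = datetime(2017, 7, 11, 12, 0, tzinfo=timezone.utc)
    print(should_renew(now, now + timedelta(minutes=30), now + timedelta(days=7)))
```

A job sitting in the queue would be checked periodically with something
like this, and the actual renewal would still be done by the usual
Kerberos tools.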


2: kdeposit is an in-house adaptation of the Heimdal kf/kfd service.
The source code is available in
/afs/pdc.kth.se/cluster/tegner/src/kdeposit

I should probably add that the SPANK plugin makes use of other
Kerberos tools that come as part of Heimdal.

Best regards,
Daniel Ahlin
PDC, KTH


On Tue, 11 July 2017 at 16:12, Glenn (Gedaliah) Wolosh <[email protected]> wrote:
>
> Thanks, this might be helpful. Two questions: 1) I don’t see how credentials 
> are renewed. Do you use another plugin for that? 2) You are calling 
> kdeposit. Is that something in-house?
>
> _______________
> Gedaliah Wolosh
> IST Academic and Research Computing Systems (ARCS)
> NJIT
> GITC 2203
> 973 596 5437
> [email protected]
>
> On Jul 11, 2017, at 2:29 AM, Daniel Ahlin <[email protected]> wrote:
>
> Hi,
>
> We are running Kerberos/AFS on some of our systems. If it is of any help feel 
> free to take a look at our implementation in 
> /afs/pdc.kth.se/cluster/tegner/src/spank_krb5_propagate/spank_krb5_propagate.c,
>  primarily lines 248-263. We are not running AUKS though, so perhaps there 
> are limited reuse possibilities.
>
> Best regards,
> Daniel
>
> On Mon, Jul 10, 2017 at 10:29 PM, Glenn (Gedaliah) Wolosh <[email protected]> 
> wrote:
>>
>> Hello;
>>
>> I’ve installed slurm 16.05 on SL 7.3 using ohpc. I also have the latest 
>> version of AUKS. I was able to hack auks so that aklog successfully runs 
>> when either obtaining or renewing a krb5 ticket.
>>
>> For example —
>> p-slogin.p-stheno.tartan.njit.edu-77 guest24>: kinit
>> Password for [email protected]:
>> p-slogin.p-stheno.tartan.njit.edu-78 guest24>: tokens
>>
>> Tokens held by the Cache Manager:
>>
>>    --End of list--
>> p-slogin.p-stheno.tartan.njit.edu-79 guest24>: auks -g
>> Auks API request succeed
>> p-slogin.p-stheno.tartan.njit.edu-80 guest24>: tokens
>>
>> Tokens held by the Cache Manager:
>>
>> User's (AFS ID 22967) tokens for [email protected] [Expires Jul 10 22:34]
>>    --End of list--
>>
>> Works just as well with auks -R loop
>>
>> I also set up a function slurm_spank_task_init() to call aklog in the auks 
>> spank plugin. Unfortunately, this does not work.
>> I get the following error —
>> p-slogin.p-stheno.tartan.njit.edu-81 guest24>: srun hostname
>> aklog: Couldn't determine realm of user:aklog: unknown RPC error 
>> (-1765328189)  while getting realm
>>
>> My guess is that in this case the user running aklog is not “guest24”
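
The number in that message can be decoded, which may narrow the guess
down. aklog prints it as an "unknown RPC error", but the value falls
inside the Kerberos 5 com_err table, where, as far as I know, it stands
for KRB5_FCC_NOFILE (no credentials cache found). The small sketch below
just does the arithmetic; the table base is the one used by the krb5
error table, and the one-entry lookup dictionary and the function name
are my own illustration, not part of any library.

```python
# Base of the krb5 com_err error table (the value of KRB5KDC_ERR_NONE).
KRB5_ERROR_TABLE_BASE = -1765328384

# Deliberately tiny lookup: only the code seen in this thread is listed.
KNOWN_CODES = {
    -1765328189: "KRB5_FCC_NOFILE (no credentials cache found)",
}


def describe_krb5_error(code):
    """Return a short description of a numeric krb5 error code."""
    offset = code - KRB5_ERROR_TABLE_BASE
    if not 0 <= offset < 256:
        return "not in the krb5 error table"
    name = KNOWN_CODES.get(code, "not listed in this sketch")
    return f"krb5 table offset {offset}: {name}"


if __name__ == "__main__":
    print(describe_krb5_error(-1765328189))
```

If that reading is right, the failure may have less to do with which
user runs aklog and more to do with aklog not finding a credential
cache in the environment where task_init runs, for example because
KRB5CCNAME is not set there. That is only a hypothesis; nothing in the
log confirms it.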
>>
>> Here are some relevant lines from the log:
>> [2017-07-10T16:19:34.763] [78.0] debug3: Entering _handle_request
>> [2017-07-10T16:19:34.763] [78.0] debug3: Leaving  _handle_accept
>> [2017-07-10T16:19:34.773] [78.0] debug:  mpi type = (null)
>> [2017-07-10T16:19:34.773] [78.0] debug:  Using mpi/none
>> [2017-07-10T16:19:34.773] [78.0] debug:  task_p_pre_launch: 78.0, task 0
>> [2017-07-10T16:19:34.773] [78.0] spank-auks: running aklog
>> [2017-07-10T16:19:34.781] [78.0] debug2: spank: auks.so: task_init = 0
>> [2017-07-10T16:19:34.781] [78.0] debug:  [job 78] attempting to run slurm 
>> task_prolog [/opt/local/bin/TaskProlog]
>> [2017-07-10T16:19:34.813] [78.0] debug2: _set_limit: conf setrlimit 
>> RLIMIT_CPU no change in value: 18446744073709551615
>> [2017-07-10T16:19:34.813] [78.0] debug2: _set_limit: conf setrlimit 
>> RLIMIT_FSIZE no change in value: 18446744073709551615
>> [2017-07-10T16:19:34.813] [78.0] debug2: _set_limit: conf setrlimit 
>> RLIMIT_DATA no change in value: 18446744073709551615
>> [2017-07-10T16:19:34.813] [78.0] debug2: _set_limit: RLIMIT_STACK  : max:inf 
>> cur:inf req:8388608
>> [2017-07-10T16:19:34.813] [78.0] debug2: _set_limit: conf setrlimit 
>> RLIMIT_STACK succeeded
>> [2017-07-10T16:19:34.813] [78.0] debug2: _set_limit: conf setrlimit 
>> RLIMIT_CORE no change in value: 0
>> [2017-07-10T16:19:34.813] [78.0] debug2: _set_limit: conf setrlimit 
>> RLIMIT_RSS no change in value: 18446744073709551615
>> [2017-07-10T16:19:34.813] [78.0] debug2: _set_limit: conf setrlimit 
>> RLIMIT_NPROC no change in value: 4096
>> [2017-07-10T16:19:34.813] [78.0] debug2: _set_limit: RLIMIT_NOFILE : 
>> max:51200 cur:51200 req:1024
>> [2017-07-10T16:19:34.813] [78.0] debug2: _set_limit: conf setrlimit 
>> RLIMIT_NOFILE succeeded
>> [2017-07-10T16:19:34.813] [78.0] debug:  Couldn't find SLURM_RLIMIT_MEMLOCK 
>> in environment
>> [2017-07-10T16:19:34.813] [78.0] debug2: _set_limit: conf setrlimit 
>> RLIMIT_AS no change in value: 18446744073709551615
>> [2017-07-10T16:19:34.815] [78.0] task 0 (5305) exited with exit code 0.
>>
>> Note that the TaskProlog also calls aklog. This will get me a token using 
>> srun but will not get me a token when using sbatch.
>>
>> I also have “UsePAM=1” in my slurm.conf, with the following slurm PAM file:
>>
>> auth    required        pam_localuser.so
>> account required        pam_unix.so
>> session required        pam_limits.so
>> session required        pam_afs_session.so
>>
>> This doesn’t work either.
>>
>> Any advice would be greatly appreciated.
>> _______________
>> Gedaliah Wolosh
>> IST Academic and Research Computing Systems (ARCS)
>> NJIT
>> GITC 2203
>> 973 596 5437
>> [email protected]
>>
>
>
