Hi,
We are using a homegrown solution for ticket propagation to jobs (we
mainly use it for AFS but hope to also use it for Lustre in the not
too distant future). We have run it in production for about 9 months
now and in various test systems for perhaps two years. I have no
substantial argument for why we did not use AUKS, so for all I know
that is probably the place to start.
If you are interested in our setup anyway, here is a rough description:
Components:
* Ticket-forwarding server/client (which we call in.kdepositd and
kdeposit respectively; these are slight modifications of the kf/kfd
tools in the Heimdal Kerberos implementation). The server part runs on
the same node as slurmctld.
* A SPANK module called spank_krb5_propagate.so which handles various
things in srun/sbatch/salloc (depositing the current TGTs on the
scheduler node using kdeposit) and in slurmd (taking the TGT deposited
on the allocated node by slurmctld (see below) and attaching it to
sessions).
* A job prolog called spank_krb5_propagator.sh which is run after node
selection to distribute the ticket deposited on the scheduler node to
the nodes allocated to the job.
Functional description:
When the user runs sbatch/salloc (or srun outside of jobs), a copy
of the user's TGT (which is expected to be available in the
environment) is deposited on the scheduler node using kdeposit (which
is exec'ed from the spank module). At the same time a job environment
variable ("KRB5_TAG") is set which identifies the ticket deposited for
this job (I would have preferred to use job IDs here, but these seem
not to be available in the SPANK environment when it is called).
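For illustration, a stripped-down sketch of what that local-side hook
could look like (this is not our actual plugin; the tag format, the
kdeposit command line and the choice of SPANK callback are just
placeholders):

/* Sketch of the local-side (srun/sbatch/salloc) part of a
 * spank_krb5_propagate-style plugin. Not the real plugin: the tag
 * format, the "kdeposit --tag" invocation and the use of setenv() to
 * export KRB5_TAG with the job environment are illustrative only. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <slurm/spank.h>

SPANK_PLUGIN(krb5_propagate, 1);

int slurm_spank_init(spank_t sp, int ac, char **av)
{
    /* Only act in the submitting/launching commands, not in slurmd. */
    if (spank_context() != S_CTX_LOCAL && spank_context() != S_CTX_ALLOCATOR)
        return ESPANK_SUCCESS;

    const char *ccname = getenv("KRB5CCNAME");
    if (ccname == NULL)          /* no TGT in the environment, nothing to do */
        return ESPANK_SUCCESS;

    /* Make up a tag that identifies this deposit (the job id is not
     * known at this point). */
    char tag[64];
    snprintf(tag, sizeof(tag), "%d-%ld-%d",
             (int)getuid(), (long)time(NULL), (int)getpid());

    /* Export the tag with the job environment so later stages can
     * find the deposited ticket. */
    setenv("KRB5_TAG", tag, 1);

    /* Hand the current TGT to the deposit daemon on the scheduler node. */
    char cmd[256];
    snprintf(cmd, sizeof(cmd), "kdeposit --tag %s", tag);
    if (system(cmd) != 0)
        slurm_error("krb5_propagate: failed to deposit TGT");

    return ESPANK_SUCCESS;
}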
When nodes are allocated to the job, spank_krb5_propagator.sh is run
through PrologSlurmctld. The script copies the deposited ticket to the
nodes (currently through kerberized ssh, but I would prefer an sbcast
that could run as root outside of jobs, were it available).
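Roughly, that step amounts to something like the following (written in
C here rather than the actual shell script; the deposit path, the way
the tag reaches the prolog and the scp invocation are all made-up
placeholders):

/* Rough sketch of the spank_krb5_propagator.sh step, expressed in C
 * for illustration. The deposit path, the KRB5_TAG lookup and the scp
 * command line are placeholders, not our actual implementation. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *nodelist = getenv("SLURM_JOB_NODELIST");
    const char *tag      = getenv("KRB5_TAG");  /* how the tag reaches the
                                                   prolog is site-specific */
    if (nodelist == NULL || tag == NULL)
        return 1;

    char deposit[512];
    snprintf(deposit, sizeof(deposit), "/var/lib/kdeposit/%s", tag);

    /* Expand the compressed nodelist (e.g. "n[01-04]") into hostnames. */
    char cmd[1024];
    snprintf(cmd, sizeof(cmd), "scontrol show hostnames %s", nodelist);
    FILE *fp = popen(cmd, "r");
    if (fp == NULL)
        return 1;

    char node[256];
    while (fgets(node, sizeof(node), fp) != NULL) {
        node[strcspn(node, "\n")] = '\0';
        char scp[2048];
        /* Kerberized scp for now; an sbcast usable as root outside of
         * a job would be preferable. */
        snprintf(scp, sizeof(scp), "scp -q %s %s:%s", deposit, node, deposit);
        if (system(scp) != 0)
            fprintf(stderr, "failed to push ticket to %s\n", node);
    }
    pclose(fp);
    return 0;
}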
When tasks are started through slurmd on the nodes, another part of
spank_krb5_propagate is run, which:
1. Copies the deposited TGT to a new ticket cache specific to each task.
2. Sets KRB5CCNAME to point to the new cache.
3. If AFS is available on the machine, creates a new PAG and runs
afslog (after this the job is started in the same exec chain).
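Again for illustration only, a sketch of that task-side hook (the
paths and the cache copy are placeholders; k_hasafs()/k_setpag() from
libkafs are one way to do the PAG part):

/* Sketch of the slurmd/task-side part: copy the node-level cache to a
 * per-task cache, point KRB5CCNAME at it and, if AFS is present, set
 * up a PAG and run afslog. Paths and the cache copy are made up for
 * illustration; k_hasafs()/k_setpag() come from Heimdal's libkafs. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <kafs.h>
#include <slurm/spank.h>

SPANK_PLUGIN(krb5_propagate, 1);

/* Runs as the user, in the task, just before exec of the job command. */
int slurm_spank_task_init(spank_t sp, int ac, char **av)
{
    if (spank_context() != S_CTX_REMOTE)
        return ESPANK_SUCCESS;

    uint32_t jobid = 0;
    int taskid = 0;
    spank_get_item(sp, S_JOB_ID, &jobid);
    spank_get_item(sp, S_TASK_ID, &taskid);

    /* 1. Copy the TGT deposited on this node into a task-specific cache
     *    (the paths below are placeholders). */
    char node_cache[256], task_cache[256], cmd[768];
    snprintf(node_cache, sizeof(node_cache), "/var/lib/kdeposit/job_%u", jobid);
    snprintf(task_cache, sizeof(task_cache), "/tmp/krb5cc_job%u_task%d",
             jobid, taskid);
    snprintf(cmd, sizeof(cmd), "cp %s %s", node_cache, task_cache);
    if (system(cmd) != 0)
        return ESPANK_ERROR;

    /* 2. Point the task's KRB5CCNAME at the new cache. */
    char ccname[300];
    snprintf(ccname, sizeof(ccname), "FILE:%s", task_cache);
    spank_setenv(sp, "KRB5CCNAME", ccname, 1);

    /* 3. If AFS is available, create a new PAG and get AFS tokens. The
     *    job is then exec'ed in this same process chain and inherits
     *    the PAG. */
    if (k_hasafs()) {
        k_setpag();
        setenv("KRB5CCNAME", ccname, 1);  /* so the afslog child sees it */
        system("afslog");
    }
    return ESPANK_SUCCESS;
}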
A couple of outstanding items that should be improved:
* Arrange for a PAM module to make the ticket cache available to
users coming in to an allocated node through ssh. This is probably
not so hard, but we don't want users to overwrite the job's
ticket cache with tickets they may propagate through ssh's own
forwarding mechanism.
* Prepopulate the copied cache with service tickets likely to be used
(to alleviate load on the KDCs); examples could be AFS-related
tickets, host tickets for the other nodes belonging to the job,
NFS-related tickets, etc.
* Possibly, _if_ the scheduler is allowed to access, directly or
indirectly, the keytabs of all services/hosts, the scheduler could
pre-populate the ticket cache with all necessary service tickets at
job launch, which would limit the dependency on and load on the KDC. A
nice variant of this is that the user's TGT would not be needed on the
compute nodes and would then be safe from possible theft.
* Ticket curation (continuous renewal of active tickets and removal
of inactive ones). This is quite simple (see the sketch below), but we
have not yet needed to do it.
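For completeness, the renewal part boils down to what kinit -R does;
here is a rough sketch with MIT-style libkrb5 calls (error handling
mostly omitted, cache path handling simplified):

/* Sketch of renewing the TGT in a job's ticket cache, essentially
 * what "kinit -R" does, using MIT-style libkrb5 calls. */
#include <string.h>
#include <krb5.h>

int renew_cache(const char *cache_path)
{
    krb5_context ctx;
    krb5_ccache cc;
    krb5_principal client;
    krb5_creds creds;

    krb5_init_context(&ctx);
    krb5_cc_resolve(ctx, cache_path, &cc);
    krb5_cc_get_principal(ctx, cc, &client);

    /* Ask the KDC for a renewed TGT based on the one already in the cache. */
    memset(&creds, 0, sizeof(creds));
    if (krb5_get_renewed_creds(ctx, &creds, client, cc, NULL) != 0)
        return -1;

    /* Replace the cache contents with the freshly renewed ticket. */
    krb5_cc_initialize(ctx, cc, client);
    krb5_cc_store_cred(ctx, cc, &creds);

    krb5_free_cred_contents(ctx, &creds);
    krb5_free_principal(ctx, client);
    krb5_cc_close(ctx, cc);
    krb5_free_context(ctx);
    return 0;
}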
If this is of general interest, we will of course share the source of
the various parts.
Best regards,
Daniel Ahlin
PDC, KTH
On Wed, May 25, 2016 at 1:21 PM, Mike Johnson
<[email protected]> wrote:
>
> Hi all,
>
> I know this is a long-standing question, but thought it was worth
> asking. I am in an environment that uses NFSv4, which obviously needs
> user credentials to grant access to filesystems. Has anyone else
> tackled the issue of unattended batch jobs successfully? I'm aware of
> AUKS. Is there any other method anyone has used?
>
> I'd be receptive to trying something like GlusterFS if it provided
> similar authentication and encryption measures.
>
> Thanks
> Mike