Re: [OpenAFS] best practice for a service to access a user AFS token? and why ruid instead of euid?

Jeffrey Altman Thu, 17 Nov 2016 22:00:09 -0800

Hi Todd,

Thanks for your note.

On 11/17/2016 2:27 PM, Todd Tannenbaum wrote:
> Hi OpenAFS gurus, I am in desperate need of your advice!
> 
> We are adding OpenAFS support to HTCondor (http://htcondor.org).
> 
> I read in the docs that the Cache Manager identifies token by either the
> user's UNIX UID or by a process authentication group (PAG).  In the
> non-PAG case, I expected the cache manager to identify the token by
> _effective_ uid.  But my testing implies that the token is identified by
> the _real_ uid.   Is there a way to change this behavior?

Your description is close but not quite correct.  Tokens are always
tracked by a Process Authentication Group.  The distinction is whether
the PAG identifier is a unique identifier independent of the user id or
is the user id.  It is an explicit security property of a PAG that one
user (including root) cannot switch a process into a PAG.  To change
this behavior would break the security guarantees that PAGs provide.

> My situation is hopefully not new or unique : I have a long-running
> service (HTCondor in this case) running as root that is implemented via
> a cooperating set of processes, and this service needs to impersonate
> different users in order to read/write to the filesystem on their behalf.

The behavior of HTCondor is similar to that of IBM's Platform Load
Sharing Facility (LSF) which distributes jobs across an HPC environment
where the jobs have the need to authenticate to and access resources
over a network.  LSF supports the ability to forward Kerberos v5 ticket
granting tickets with the jobs and renew them on the individual nodes.  LSF
has the ability to create a PAG but does not have any specific knowledge
of how to obtain afs (or AuriStorFS) tokens.  Those tokens are obtained
by executing a shell script that can be configured with the job.

When jobs require weeks or months to complete it is critical from a
security perspective to distribute Kerberos tickets that have a
short lifetime but can be renewed for the length of the job provided
that the represented principal has not been disabled.

> Ideally I'd like this service to be as ignorant of AFS as possible.  It
> currently performs impersonation by changing out the effective UID,
> which of course works fine for the local filesystem.
>
> Meanwhile, I have another service running (a home-brewed "credential
> manager") on the machine that keeps AFS tokens refreshed for all users
> of the service by doing an "aklog" as each user.  Both HTCondor and the
> credential manager are not running in a PAG, as my understanding is a
> PAG can only hold one user token per cell.

It is correct that a cache manager can only associate one token (or
identity) per cell in each PAG.  If a PAG stored more than one identity
per cell there would be an identity selection problem.

There is nothing that prevents a user-land application from storing more
than one token per cell and setting the desired token into the cache
manager before the file system requests that must use it are issued.
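
A minimal sketch of such a user-land store, under stated assumptions:
the token is treated as opaque bytes, and `install` is a hypothetical
hook standing in for whatever call the service uses to set a token into
the cache manager for its PAG.  `TokenStore`, `put` and `activate` are
illustrative names, not part of any AFS library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class TokenStore:
    """User-land store holding one opaque token per (user, cell) pair."""

    # Hypothetical hook: sets a token for a cell into the cache manager
    # for the calling process's PAG. Not a real library call.
    install: Callable[[str, bytes], None]
    _tokens: Dict[Tuple[str, str], bytes] = field(default_factory=dict)

    def put(self, user: str, cell: str, token: bytes) -> None:
        """Remember (or replace) the token held for this user in this cell."""
        self._tokens[(user, cell)] = token

    def activate(self, user: str, cell: str) -> None:
        """Set this user's token before issuing file system requests."""
        self.install(cell, self._tokens[(user, cell)])
```

The store may hold many identities per cell, but only one is installed
at a time, so the cache manager never has to choose between them.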

> The problem is the HTCondor service cannot access AFS on behalf of the
> user being impersonated by simply switching effective UID; it apparently
> needs to switch over the real UID as well.  Thanks to the saved UID and
> the setresuid() syscall, it is trivial to change HTCondor to switch both
> the real and effective UID over to the user being impersonated, and
> still switch back to root afterwards.  Doing so would likely make
> everything work great with AFS.  But the problem is changing the
> real-uid is a potential security hole, as now HTCondor service processes
> could receive signals (i.e. SIGKILL!) from the unprivileged users it is
> simply trying to impersonate.   This is why OpenAFS's reliance on using
> the real UID for identifying tokens seems very broken to me; seems like
> it should be euid (or specifically on Linux clients, the file-system
> uid) like every other similar system...

The security hole that you want to avoid introducing into HTCondor is
exactly the security hole that would be introduced into every
application if PAGs followed the effective uid.
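
For reference, the signal exposure mentioned in the quoted text follows
from the Linux kill(2) permission rule: an unprivileged sender may
signal a target only if the sender's real or effective UID equals the
target's real or saved set-user-ID.  The sketch below models just that
rule as a pure function; `may_signal` is an illustrative name and the
UID 1000 in the usage note is an arbitrary example value.

```python
def may_signal(sender_ruid: int, sender_euid: int,
               target_ruid: int, target_suid: int) -> bool:
    """Model of the kill(2) UID check for an unprivileged sender.

    The target's effective UID plays no part in the check, which is why
    switching only the effective UID does not expose a root service.
    """
    target_ids = (target_ruid, target_suid)
    return sender_ruid in target_ids or sender_euid in target_ids
```

With a service that keeps real and saved UID 0 and switches only its
effective UID, `may_signal(1000, 1000, 0, 0)` is false.  Once the real
UID is switched as well, `may_signal(1000, 1000, 1000, 0)` is true.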

> 
> Any thoughts/advice?
> 
> One idea is I could have all the HTCondor service processes run in their
> own PAG and perform some sort of very lightweight "aklog" whenever
> impersonation is required. But I need this "aklog" to be very
> fast/lightweight; I don't want to be blocked on network communication to
> some KDC or even file I/O if I can help it.  The service would have
> access to a KRB5 credential cache for the user that has an AFS service
> principal; given that, is there some lightweight "aklog"-like in-process
> library call I can use (that avoids I/O)?  Something the HTCondor
> service could dlopen()/dlsym() (so I don't introduce a pile of
> dependencies) ?

There are four steps that need to be performed:

1. obtain and/or renew Kerberos v5 tickets

2. obtain the afs (or AuriStorFS) token

3. create the PAG for the process

4. set the token for the PAG

Step 3 needs to be performed once per process and does not involve the
network.

Step 4 needs to be performed each time the token is replaced due to
renewal.  It does not require network.

Steps 1 and 2 do require network but can be performed in the background
after the initial acquisition.
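
The split between network and non-network steps can be sketched as
follows.  This is only an illustration under stated assumptions: the
four hooks (`get_tickets`, `get_token`, `create_pag`, `set_token`) are
hypothetical placeholders for whatever the service actually uses, and
the class and method names are invented for the example.  A background
thread performs steps 1 and 2, and the service thread applies step 4
without blocking.

```python
import queue
import threading
from typing import Callable, Optional


class TokenRefresher:
    """Drives the four steps; every hook is a hypothetical placeholder."""

    def __init__(self,
                 get_tickets: Callable[[], None],    # step 1 (network)
                 get_token: Callable[[], bytes],     # step 2 (network)
                 create_pag: Callable[[], None],     # step 3 (no network)
                 set_token: Callable[[bytes], None]  # step 4 (no network)
                 ) -> None:
        self.get_tickets = get_tickets
        self.get_token = get_token
        self.create_pag = create_pag
        self.set_token = set_token
        self._pending: "queue.Queue[bytes]" = queue.Queue()

    def fetch(self) -> None:
        """Steps 1 and 2: may block on the network, so run off the hot path."""
        self.get_tickets()
        self._pending.put(self.get_token())

    def start(self) -> None:
        """Initial acquisition: steps 1, 2, 3 and 4, in that order."""
        self.fetch()
        self.create_pag()  # once per process
        self.apply_pending()

    def apply_pending(self) -> bool:
        """Step 4: set the newest fetched token, if any, without blocking."""
        token: Optional[bytes] = None
        while True:
            try:
                token = self._pending.get_nowait()  # keep only the latest
            except queue.Empty:
                break
        if token is None:
            return False
        self.set_token(token)
        return True

    def run_background(self, interval: float,
                       stop: threading.Event) -> threading.Thread:
        """Repeat steps 1 and 2 every `interval` seconds until `stop` is set."""
        def loop() -> None:
            # Event.wait returns False on timeout and True once stop is set.
            while not stop.wait(interval):
                self.fetch()
        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread
```

The service calls `start()` once, then `apply_pending()` at a convenient
point before it needs the token; that call never touches the network.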

> 
> Another idea, as I only care about supporting Linux, would be to
> leverage Linux kernel keyrings.  I am thinking perhaps my credential
> manager could link the "afs_pag: _pag" key to the user keyring, and then
> the HTCondor service could link this key into its session keyring when
> impersonating.  Does anyone think that would work?  Or is there more to
> swapping PAGs in and out (i.e. besides the key on the keyring, pagsh
> seems to do some magic with groups as "/bin/id" shows some magic groups
> when in a PAG...) ?  Is the keyring-based idea crazy talk or a good idea
> to pursue if Linux is my only target?

The groups are not required on Linux.  The cache manager uses keyrings
internally to track the PAG.

The behavior you are seeking with user-land manipulation of keyrings to
store afs tokens is the one that David Howells's kafs client will provide
when it is complete.

  https://www.infradead.org/~dhowells/kafs/

> I've seen the great lengths (i.e. immense amount of code, security
> side-steps of creating their own krb4 tickets) that Samba has done to
> support AFS; I am hoping there is an easier way.

Samba had a different problem than HTCondor.  An SMB 1.1 server used to
authenticate a client by the client sending a username and password over
the wire to the server which could then be used with kinit (or
equivalent library calls) to obtain a Kerberos TGT.  In the 90s this was
more often than not Kerberos v4.

In SMB 1.2 NTLM authentication was introduced and later GSS-Kerberos v5
authentication.  Both avoided the transmission of the user's password.
Therefore, Samba had to obtain a Kerberos v5 ticket granting ticket or
an AFS token some other way.  This typically made use of an
impersonation service to acquire a TGT or Token after asserting it had
authenticated the user identity.  Many single sign-on web authentication
services utilize a similar model.

> Your suggestions greatly appreciated.

The best approach in my opinion is to follow the LSF model.

Jeffrey Altman
AuriStor Inc.
