Tanya and I have been evaluating authentication use cases against available
solutions for Pulp 3. This email includes a summary of the goals we
identified, based on input from numerous stakeholders, and a working proposal
for a set of technologies we can use to achieve them. Please provide
feedback, ask questions, and suggest alternatives if you know of something
that may be better. I hope to turn this thread into a set of Redmine tasks
and stories soon.
Please note that this covers authentication only. Authorization is a
related, but distinct, problem that we think can be solved separately.
Planning thus far has been tracked here:
- As a user, I can authenticate to the REST API when pulp is the authn
- As a user, I can authenticate to the REST API using an external authority
such as FreeIPA or AD.
- Participate in the django auth ecosystem so we don't have to reinvent
things, and so we can integrate well with django add-ons.
- The REST API does not and should not have sessions. Any persistent token
should be used for auth only.
- Authenticating to external authorities is hard. To the extent that it's
reasonable to do so, leverage other tools for this.
The working proposal is to use djangorestframework-jwt for token-based
authentication to the REST API, and leverage a set of apache modules to
handle authentication and retrieval of user attributes from an external
Pulp 2 supports client SSL certificates as an option for authentication,
and as the only option when using the "login" feature. Client SSL certs are
a well-known and tested standard that is robust. However, their complexity
has caused continued friction in the user experience. Situations such as an
expired certificate, changing the CA on the server, or restricted
filesystem access to a certificate are difficult to diagnose in large part
because SSL libraries do a poor job of error reporting. Connection
negotiation fails, and it can be unclear why.
From a development perspective, working with client SSL certificates can be
challenging. For example, many libraries require a path to a certificate on
disk rather than accepting one held in memory.
As such, while client SSL certificates would be a viable solution for Pulp
3, a token-based authentication approach would be simpler and more in line
with how other APIs handle authentication.
Token authentication may be marginally less secure than client SSL
certificates, since the entire token must be sent with every request.
However, for a token to be compromised (assuming HTTPS is in use), a third
party would need the ability to eavesdrop on and decrypt the SSL traffic.
Django and DRF (django rest framework) provide basic auth support out of
the box, including password management. This can be enabled for the entire
REST API and any other views we want.
Most modern network-based APIs use some variety of token authentication. A
token is a string obtained after proving identity (through basic auth or
some other means), which is used to prove identity for future requests by
including it in an Authorization header.
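As a rough illustration of the header mechanics, here is a minimal sketch in
Python. The function names are hypothetical, and the "JWT" prefix is an
assumption for the example; many APIs use "Bearer" instead.

```python
from typing import Optional


def build_auth_header(token: str, prefix: str = "JWT") -> dict:
    """Client side: wrap a previously obtained token in an Authorization header."""
    return {"Authorization": f"{prefix} {token}"}


def extract_token(header_value: Optional[str], prefix: str = "JWT") -> Optional[str]:
    """Server side: return the token, or None if the header is absent or malformed."""
    if not header_value:
        return None
    parts = header_value.split()
    # Expect exactly "<prefix> <token>"; anything else is not ours to handle.
    if len(parts) != 2 or parts[0] != prefix:
        return None
    return parts[1]
```

A request carrying some other scheme, such as basic auth, simply yields None
here and can fall through to another authentication mechanism.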
Authenticating against an external authority can be an expensive operation,
particularly in terms of latency and load on the external service. Using
tokens allows the external authn to take place only once per user within
some period of time.
Tokens normally do not require server-side state. (There are exceptions,
such as DRF's own token support, which stores a random string in the DB
just because it's convenient.) This reduces database dependence and use,
which has slight performance benefits and would allow services to respond
to requests even when the database is unavailable. It also allows trust to
easily be delegated among services.
One advantage that client SSL certs have over tokens is that they can be
explicitly revoked (although doing so may not be easy). Tokens can be
indirectly revoked if they contain an "issued" timestamp. For example, it's
normal for a token-based authn system to reject a token that was issued
before the user's most recent password reset. That said, if tokens are only
being used for authn, user access can and should still be enforced by
authz, so that is another mechanism for turning off a user's access.
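The "issued" timestamp check described above could look roughly like the
following. This is a sketch only: it assumes the token carries the standard
"iat" (issued at) claim from RFC 7519 as seconds since the epoch, and that
the application records when each user's password last changed. The
function name is hypothetical.

```python
def issued_after_password_change(claims: dict, password_changed_at: float) -> bool:
    """Return True if the token was issued at or after the last password change.

    `claims` is the already signature-verified token payload.
    """
    issued_at = claims.get("iat")
    if issued_at is None:
        # Without an issue time we cannot tell, so treat the token as revoked.
        return False
    return issued_at >= password_changed_at
```

Resetting a password then invalidates every token issued before the reset,
without keeping a list of individual tokens on the server.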
DRF includes built-in token support, but it stores tokens in the database.
This adds the characteristics of a session identifier, which is not
desirable for a stateless REST API.
JSON Web Tokens are a popular open standard (RFC 7519) commonly used for
API authentication. They are widely supported by many libraries in many
languages, simple to use, small, and can include arbitrary data as is
useful to the issuing application. Validity is verified by signature and an
optional expiration time.
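To make "verified by signature and an optional expiration time" concrete,
here is a minimal sketch of an HS256-signed JWT using only the Python
standard library. It is for illustration and is not how drf-jwt is
implemented; the function names and the shared-secret setup are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing "=" padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def encode_jwt(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def decode_jwt(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature and expiry check out, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

Note that nothing is looked up in a database: any service holding the secret
can validate the token on its own.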
djangorestframework-jwt is a recommended library that works out of the box,
and it is our current proposal for use with Pulp 3. I'll refer to it as
"drf-jwt" for brevity.
How to Get a Token
drf-jwt comes with a view that issues a token after successful basic auth,
integrating with django's user ecosystem. There is also a view for renewing
Given that Pulp also needs to authenticate against external sources, the
view used to obtain a token needs to be extended to support that. It's a simple
integration point, but before diving more into drf-jwt, let's look at that
The FreeIPA project has a comprehensive guide that is worth reading if you
are interested in the topic:
They advise that utilizing any of several apache modules for external auth
is a best practice, combined with mod_lookup_identity to pass user
attributes to a web application. The primary advantage is that this
offloads authentication work, which can be very complex, to a separate
project that specializes in it.
One downside is that this potentially limits deployment to apache httpd,
although there is work in progress to make similar modules available for
nginx. Presumably users who do not require external auth could use a
non-apache web server. Alternatively, some django plugins support specific
types of external authn without requiring apache modules, such as
django-auth-ldap.
mod_lookup_identity is the solution the FreeIPA project recommends for
allowing a web app to discover user identity and related attributes from a
trusted authentication source. It uses SSSD to look up attributes, and then
it sets various REMOTE_USER_* environment variables within the context of a
request. Any web application can then trust those values, making it a
simple integration point.
Commonly-available attributes include username, email, first name, last
name, and group membership.
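From the application's side, reading those values might look like the sketch
below. The exact variable names depend on how mod_lookup_identity is
configured, so REMOTE_USER_EMAIL, REMOTE_USER_FIRSTNAME,
REMOTE_USER_LASTNAME, and the REMOTE_USER_GROUP_N / REMOTE_USER_GROUP_1..N
scheme are assumptions here, as is the function name.

```python
from typing import Optional


def identity_from_environ(environ: dict) -> Optional[dict]:
    """Collect user attributes the web server placed in the request environment.

    Returns None when the server did not authenticate anyone.
    """
    username = environ.get("REMOTE_USER")
    if not username:
        return None

    # Assumed layout: a count in *_GROUP_N, then one variable per group.
    group_count = int(environ.get("REMOTE_USER_GROUP_N", 0))
    groups = [
        environ[key]
        for key in (f"REMOTE_USER_GROUP_{i}" for i in range(1, group_count + 1))
        if key in environ
    ]

    return {
        "username": username,
        "email": environ.get("REMOTE_USER_EMAIL", ""),
        "first_name": environ.get("REMOTE_USER_FIRSTNAME", ""),
        "last_name": environ.get("REMOTE_USER_LASTNAME", ""),
        "groups": groups,
    }
```

This is only safe when the web server is the sole path to the application,
since the values are trusted without further checks.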
Auto-Creation of Users
In addition to looking for and trusting the REMOTE_USER environment
variable, the drf-jwt token creation view would be extended to
automatically create and update users. When invoked, it would:
- trust the REMOTE_USER value for authentication
- if the user exists in the DB, update its attributes with the other
- if the user does not exist in the DB, create it based on the
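The create-or-update behavior could be sketched as follows. To keep the
example self-contained it uses a plain dict as a stand-in for the user
table; a real implementation would presumably go through django's user
model instead. The function and field names are hypothetical.

```python
from typing import Tuple

# Attributes copied from the external identity onto the local user record.
SYNCED_FIELDS = ("email", "first_name", "last_name")


def sync_user(users: dict, identity: dict) -> Tuple[dict, bool]:
    """Create or update the local record for an externally authenticated user.

    `users` maps username -> record. Returns (record, created).
    """
    username = identity["username"]  # trusted; the web server did the authn
    record = users.get(username)
    created = record is None
    if created:
        record = users[username] = {"username": username}
    # Refresh attributes on every token request so directory changes propagate.
    for field in SYNCED_FIELDS:
        if field in identity:
            record[field] = identity[field]
    return record, created
```

The (record, created) return shape mirrors django's update_or_create()
convention, which should make a later switch to the ORM straightforward.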
Group membership is an interesting aspect. One compelling approach we saw
is to auto-create groups and prefix their names with something like "ext:".
A user in the "ops" group in their enterprise directory would get put in an
auto-generated pulp group called "ext:ops".
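A minimal sketch of that naming scheme is below. It also assumes, beyond
what is decided so far, that externally sourced memberships are replaced on
each sync while locally assigned groups are left alone. The function name
is hypothetical.

```python
EXTERNAL_PREFIX = "ext:"


def sync_external_groups(current_groups: set, directory_groups: list) -> set:
    """Return the user's new group set after syncing with the directory.

    Groups without the prefix were assigned inside the application and are kept.
    Prefixed groups are rebuilt from what the directory reports right now.
    """
    local = {name for name in current_groups if not name.startswith(EXTERNAL_PREFIX)}
    external = {EXTERNAL_PREFIX + name for name in directory_groups}
    return local | external
```

For example, a user currently in {"admins", "ext:qa"} whose directory
groups are now ["ops"] would end up in {"admins", "ext:ops"}. The prefix
also keeps a directory group named "admins" from colliding with a local
group of the same name.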
Authz will merit its own planning, but presumably we will include the
ability to authorize based on group membership. This would provide a nice
integration story where a new employee could be added to a group by HR in
their enterprise directory, and they would automatically get the
corresponding permissions in pulp.
The working proposal is to use djangorestframework-jwt, and extend it to
trust the REMOTE_USER_* environment variables when creating tokens.
External authentication would be done by one of several apache modules,
thus requiring no explicit support in pulp.
Please ask questions and provide feedback.
A big thank-you goes to Tanya, who kept this investigation going and did a
ton of the analysis. Also thank you to the many users who provided input,
which we will continue to apply while planning authz.