Hi everyone,

I would like to talk about auth backends.

In Airflow 2, there are multiple options for authenticating REST API calls. 
These options are called auth backends 
(https://airflow.apache.org/docs/apache-airflow-providers-fab/stable/auth-manager/api-authentication.html).
The deployment manager configures which authentication mechanism is used for 
REST API calls. There are several options:
- session
- basic_auth
- Kerberos
- Google OpenID

For example, if the deployment manager sets `auth_backends = 
airflow.providers.fab.auth_manager.api.auth.backend.basic_auth`, then users 
must authenticate their REST API calls using basic authentication 
(username/password). Note: Multiple auth backends can be configured (e.g. 
`auth_backends = 
airflow.providers.fab.auth_manager.api.auth.backend.basic_auth,airflow.providers.fab.auth_manager.api.auth.backend.session`).
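To make the basic_auth case concrete, here is a minimal client-side sketch 
using only the Python standard library. It only builds the request, it does not 
send it. The host, port and `admin`/`admin` credentials are placeholders I made 
up for illustration; `/api/v1/dags` is the Airflow 2 stable REST API endpoint 
that lists DAGs.

```python
import base64
import urllib.request


def build_basic_auth_request(
    base_url: str, username: str, password: str
) -> urllib.request.Request:
    """Build a GET request for the DAG list, authenticated with HTTP basic auth."""
    # Basic auth is "username:password", base64-encoded, in the Authorization header.
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(f"{base_url}/api/v1/dags")
    request.add_header("Authorization", f"Basic {encoded}")
    return request


# Placeholder URL and credentials, assumed for this example only.
request = build_basic_auth_request("http://localhost:8080", "admin", "admin")

# Sending the request needs a running Airflow 2 webserver, for example:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```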

Most (if not all) of these auth backends are provided by providers. They follow 
the same interface and must define two functions:
- `init_app`: initializes resources if needed.
- `requires_authentication`: checks whether the authentication provided in the 
request is valid. For example, `basic_auth` extracts the authentication 
information from the Flask request and checks whether the username/password 
provided are valid. If authentication succeeds, the user is saved in the 
session so that the API itself can use it subsequently.
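To illustrate the shape of this interface, here is a framework-free sketch of a 
backend exposing the two functions. It is not the FAB implementation: the 
request is a plain dict instead of a Flask request, and the `_USERS` store, the 
`_parse_basic_auth` helper and the `list_dags` endpoint are hypothetical names 
introduced only for this example.

```python
import base64
from functools import wraps
from typing import Any, Callable, Optional, Tuple

# Hypothetical in-memory user store; a real backend would check the user database.
_USERS = {"admin": "admin"}


def init_app(app: Any) -> None:
    """Initialize resources if needed. This sketch has nothing to set up."""


def _parse_basic_auth(header: Optional[str]) -> Optional[Tuple[str, str]]:
    """Return (username, password) from a 'Basic ...' header, or None if malformed."""
    if not header or not header.startswith("Basic "):
        return None
    try:
        decoded = base64.b64decode(header[len("Basic "):]).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None
    username, separator, password = decoded.partition(":")
    return (username, password) if separator else None


def requires_authentication(function: Callable) -> Callable:
    """Decorator: reject the call unless the request carries valid credentials."""

    @wraps(function)
    def decorated(request: dict, *args: Any, **kwargs: Any) -> Any:
        credentials = _parse_basic_auth(request["headers"].get("Authorization"))
        if credentials is None or _USERS.get(credentials[0]) != credentials[1]:
            return {"status": 401, "body": "Unauthorized"}
        # Save the user in the session so the endpoint can use it afterwards.
        request["session"]["user"] = credentials[0]
        return function(request, *args, **kwargs)

    return decorated


@requires_authentication
def list_dags(request: dict) -> dict:
    """Hypothetical endpoint that relies on the user stored in the session."""
    return {"status": 200, "body": f"dags visible to {request['session']['user']}"}
```

Calling `list_dags` with a dict such as `{"headers": {"Authorization": "Basic 
..."}, "session": {}}` either returns the 401 response or runs the endpoint 
with the user available in the session.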

In Airflow 3, all APIs (but here we will focus only on the public API) use JWT 
tokens for authentication. Every API request must include a valid JWT token to 
be authenticated. This is not an issue when using the UI since the UI manages 
JWT tokens automatically. However, what happens when users call the public API 
directly?

I see three possible options:

Option 1. Deprecate auth backends and introduce an API to generate JWT tokens. 
This approach aligns with how modern web applications handle authentication. It 
would simplify authentication in Airflow by enforcing a single strategy: 
JWT-based authentication. User flow:
- Call an API to generate a JWT token, providing authentication details such as 
a username and password
- If authentication succeeds, the API returns a JWT token
- Use this JWT token to authenticate public API calls
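The flow above can be simulated in-process with the standard library alone. 
This is only a sketch of the idea, not how Airflow or any auth manager 
implements it: the `SECRET`, the `USERS` store and the function names 
`create_token` and `authenticate` are assumptions made for this example, and 
the token is a hand-rolled HS256 JWT rather than one produced by a JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"change-me"       # hypothetical signing key
USERS = {"admin": "admin"}  # hypothetical user store


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text after restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> str:
    digest = hmac.new(SECRET, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return _b64url(digest)


def create_token(username: str, password: str, ttl_seconds: int = 3600) -> Optional[str]:
    """Steps 1 and 2: check the credentials and, on success, return a signed JWT."""
    expected = USERS.get(username)
    if expected is None or not hmac.compare_digest(expected.encode(), password.encode()):
        return None
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    claims = {"sub": username, "exp": int(time.time()) + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode("utf-8"))
    signing_input = f"{header}.{payload}"
    return f"{signing_input}.{_sign(signing_input)}"


def authenticate(authorization_header: str) -> Optional[str]:
    """Step 3: validate a 'Bearer <token>' header; return the user name or None."""
    scheme, _, token = authorization_header.partition(" ")
    parts = token.split(".")
    if scheme != "Bearer" or len(parts) != 3:
        return None
    expected_signature = _sign(f"{parts[0]}.{parts[1]}")
    if not hmac.compare_digest(expected_signature.encode(), parts[2].encode()):
        return None
    claims = json.loads(_b64url_decode(parts[1]))
    if claims["exp"] < time.time():
        return None  # token has expired
    return claims["sub"]


token = create_token("admin", "admin")
user = authenticate(f"Bearer {token}")
```

With a real deployment, `create_token` would be the token API exposed by the 
auth manager, and `authenticate` would be the check performed by the public API 
on every request.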

The API for generating JWT tokens would be provided by the auth manager. The 
Simple auth manager already supports this, and Bugra is working on adding it to 
the FAB auth manager (PR: https://github.com/apache/airflow/pull/47043).

This is my preferred solution, but there are some caveats:
- Users will need to update their authentication methods. However, since 
Airflow 3 already introduces breaking changes, users will need to adjust their 
integrations regardless.
- Some authentication strategies, such as Google OpenID, would no longer be 
possible. Since authentication shifts from the auth backend to the auth 
manager, providers without an auth manager (like Google) would lose their auth 
backends. To expand on this: because the API to create a JWT token is provided 
by the auth manager, a provider that has an auth manager could support all the 
authentication mechanisms of its auth backends through this API. For example, 
the FAB provider currently supports basic_auth and Kerberos as auth backends. 
In Airflow 3, the FAB auth manager could support both authentication mechanisms 
for generating JWT tokens.

Option 2. Update auth backends to be compatible with Airflow 3. To support 
Airflow 3, we would need to:
- Modify auth backends to use FastAPI instead of Flask
- Update the `requires_authentication` function, since it currently validates 
authentication and stores the user in a session, an approach incompatible with 
Airflow 3.
- Ensure compatibility with both Airflow 2 and Airflow 3, meaning we would have 
two implementations for each auth backend (one for AF2, one for AF3).

Pros:
- Backward compatibility: Users can continue using existing authentication 
methods across Airflow 2 and Airflow 3.

Cons:
- Increased maintenance complexity due to dual implementations.

Option 3. Support both Option 1 and Option 2. This would give users the 
flexibility to choose their preferred authentication method.

What do you think?

Vincent
