vincbeck commented on PR #56433:
URL: https://github.com/apache/airflow/pull/56433#issuecomment-3406314370

   > The only option I see is to consider changing the Keycloak model so that 
each DAG (or DAG + team pair) becomes an explicit UMA resource in Keycloak. 
This would allow the `KeycloakAuthManager` to use python-keycloak’s 
`uma_permissions()` (or the REST equivalent) to retrieve all DAG#GET 
permissions for a user in a single request, instead of evaluating each DAG 
individually. That way, `batch_is_authorized_dag()` could be implemented 
efficiently and remove the current per-DAG loop that sends one UMA request per 
DAG.
   > 
   > The main benefit of this approach is that it would enable true batching, 
with a single UMA call returning all DAGs the user can access, which would 
greatly reduce latency and load on Keycloak in environments with many DAGs. It 
would also simplify Airflow’s authorization logic by replacing multiple 
`is_authorized_dag()` evaluations with one aggregated call, aligning with how 
other AuthManagers handle batch authorization.
   > 
   > The trade-off, however, is that this model introduces full lifecycle 
management of DAG resources within Airflow. We would need to create, update, 
and delete corresponding Keycloak resources whenever DAGs are added, renamed, 
or removed. That requires maintaining an admin client in Airflow with 
credentials to perform those operations and ensuring the synchronization stays 
consistent. DAG renames and ownership changes would also need careful handling 
to avoid stale or orphaned resources. In short, modeling each DAG as a real 
Keycloak resource would make DAG#GET checks efficient and batchable, but it 
comes at the cost of adding a fair amount of lifecycle and operational 
complexity on our side.
   > 
   > Is this something worth exploring?
   
   Yeah, I think you summarized it pretty well. Having all DAGs and other 
resources (e.g. connections, variables, ...) declared as first-class resources 
in Keycloak was actually the approach I took initially, but because of the 
reason you mention (having to keep every resource defined in Airflow also 
defined in Keycloak, at all times) I changed the implementation.
   
   To be honest, I still do not know which of the two is the better or worse 
solution; I do not like either of them today. But to me, having to constantly 
sync all resources from Airflow to Keycloak is a BIG pain point.
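
   For illustration, here is a rough sketch (not what the PR does today) of how 
`batch_is_authorized_dag()` could collapse into a single python-keycloak 
`uma_permissions()` call if each DAG were registered as a Keycloak resource. The 
client settings, the `DAG:<dag_id>` naming scheme, and the exact response shape 
below are assumptions for the sake of the example:

   ```python
   from keycloak import KeycloakOpenID

   # Placeholder client configuration, not taken from this PR.
   keycloak_client = KeycloakOpenID(
       server_url="https://keycloak.example.com/",
       realm_name="airflow",
       client_id="airflow",
       client_secret_key="***",
   )

   def batch_is_authorized_dag(user_token: str, dag_ids: set[str]) -> set[str]:
       """Return the subset of dag_ids the user may read, using one UMA call."""
       # A single request returns every permission granted to the token; the
       # assumed response shape is a list of {"rsname": ..., "scopes": [...]}.
       granted = keycloak_client.uma_permissions(user_token)
       readable = {
           perm["rsname"].removeprefix("DAG:")
           for perm in granted
           if perm.get("rsname", "").startswith("DAG:")
           and "GET" in perm.get("scopes", [])
       }
       return dag_ids & readable
   ```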

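   And on the lifecycle side, a minimal sketch of the sync that would be needed 
whenever DAGs are added or removed, assuming python-keycloak's `KeycloakUMA` 
resource-set helpers and the same placeholder names as above (again an 
illustration, not an existing implementation):

   ```python
   from keycloak import KeycloakOpenIDConnection, KeycloakUMA

   # Placeholder connection; the client must be allowed to manage its own resources.
   connection = KeycloakOpenIDConnection(
       server_url="https://keycloak.example.com/",
       realm_name="airflow",
       client_id="airflow",
       client_secret_key="***",
   )
   uma = KeycloakUMA(connection=connection)

   def sync_dag_resources(current_dag_ids: set[str]) -> None:
       """Create UMA resources for new DAGs and delete them for removed DAGs."""
       existing = {
           res["name"]: res["_id"]
           for res in uma.resource_set_list()
           if res.get("name", "").startswith("DAG:")
       }
       wanted = {f"DAG:{dag_id}" for dag_id in current_dag_ids}

       for name in wanted - existing.keys():
           # One resource per DAG, exposing the scopes the auth manager checks.
           uma.resource_set_create({"name": name, "scopes": ["GET", "PUT", "DELETE"]})

       for name in existing.keys() - wanted:
           # Remove resources for DAGs that no longer exist to avoid orphans.
           uma.resource_set_delete(existing[name])
   ```

   This is exactly the sync overhead mentioned above: every DAG addition, rename, 
or deletion in Airflow has to be mirrored into Keycloak, with credentials that 
allow it.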
