vincbeck commented on PR #56433: URL: https://github.com/apache/airflow/pull/56433#issuecomment-3406314370
> The only option I see is to consider changing the Keycloak model so that each DAG (or DAG + team pair) becomes an explicit UMA resource in Keycloak. This would allow the `KeycloakAuthManager` to use python-keycloak’s `uma_permissions()` (or the REST equivalent) to retrieve all DAG#GET permissions for a user in a single request, instead of evaluating each DAG individually. That way, `batch_is_authorized_dag()` could be implemented efficiently and remove the current per-DAG loop that sends one UMA request per DAG.
>
> The main benefit of this approach is that it would enable true batching, with a single UMA call returning all DAGs the user can access, which would greatly reduce latency and load on Keycloak in environments with many DAGs. It would also simplify Airflow’s authorization logic by replacing multiple `is_authorized_dag()` evaluations with one aggregated call, aligning with how other AuthManagers handle batch authorization.
>
> The trade-off, however, is that this model introduces full lifecycle management of DAG resources within Airflow. We would need to create, update, and delete corresponding Keycloak resources whenever DAGs are added, renamed, or removed. That requires maintaining an admin client in Airflow with credentials to perform those operations and ensuring the synchronization stays consistent. DAG renames and ownership changes would also need careful handling to avoid stale or orphaned resources. In short, modeling each DAG as a real Keycloak resource would make DAG#GET checks efficient and batchable, but it comes at the cost of adding a fair amount of lifecycle and operational complexity on our side.
>
> Is it something worth to explore?

Yeah, I think you summarized it pretty well. Having all Dags and other resources (e.g. connection, variable, ...) declared as first-class resources in Keycloak was actually the approach I took initially, but for the reason you mention (every resource defined in Airflow would also have to be kept defined in Keycloak at all times) I changed the implementation. To be honest, I still do not know which of the two solutions is better. I do not like either of them today. But to me, having to constantly sync all resources from Airflow to Keycloak is a BIG pain point.
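
For illustration only, here is a minimal sketch of what the batched check from the quoted proposal could look like, assuming each Dag were registered in Keycloak as a UMA resource. The `Dag:<dag_id>` resource naming, the `GET` scope name, and the helper `batch_authorized_dag_ids` are hypothetical and not part of the current `KeycloakAuthManager`. The `fetch_permissions` callable stands in for a single permissions request (for example, a wrapper around python-keycloak's `uma_permissions()`), which is assumed to return a list of entries with `rsname` and `scopes` keys.

```python
from typing import Callable, Iterable

# Hypothetical convention: every Dag is a Keycloak UMA resource named
# "Dag:<dag_id>". This prefix is an assumption made for the sketch.
RESOURCE_PREFIX = "Dag:"


def batch_authorized_dag_ids(
    dag_ids: Iterable[str],
    fetch_permissions: Callable[[], list[dict]],
    scope: str = "GET",
) -> set[str]:
    """Return the subset of dag_ids the user may access.

    fetch_permissions is called exactly once, so the cost is a single
    round trip to the authorization server instead of one per Dag.
    """
    granted: set[str] = set()
    for permission in fetch_permissions():
        name = permission.get("rsname", "")
        # Entries may omit "scopes" when a resource has none; treat as empty.
        scopes = permission.get("scopes", [])
        if name.startswith(RESOURCE_PREFIX) and scope in scopes:
            granted.add(name[len(RESOURCE_PREFIX):])
    # Only report Dags the caller actually asked about.
    return granted & set(dag_ids)
```

Note that this only covers the read side. It does nothing about the pain point above: the `Dag:*` resources would still have to be created, renamed, and deleted in Keycloak whenever the Dags change in Airflow.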
