potiuk edited a comment on pull request #19857: URL: https://github.com/apache/airflow/pull/19857#issuecomment-980781255
> @potiuk You don't necessary need to write a custom secrets backend in order to have secrets rotation. E.g. if you want to use aws secrets manager's built-in secrets rotation capabilities, the existing backend [now supports it](https://github.com/apache/airflow/pull/18764). and presumably with most of the backends we have, you could implement rotation with processes external to airflow. Yeah. I know that, but one problem I have with it is that it is AWS only and you have to add quite a lot of code to support similar patterns for other secrets. I thought about something more "generic" . Maybe we should apply the approach of AWS for all other backends? Maybe we can find a similar, generic-enough solution for all of them. I think your proposal is attempt to do so, but I think (see below) it might be that it looks at the problem from a wrong angle. > is that really all that common? I have no "hard" data on that one. Just anecdotal evidence. But when I take off my "Airflow" hat and start to think as a user (in this case a person who operates organisational secrets and has to organize management. security, rotation etc.), I believe our approach is very "airflow centric" and in fact "selfish". If I were such a person who manages "credentials" for an organisation, I'd be quite pissed of at Airflow. Why? Secrets storages are "shared" resources for an organisation. IMHO we should not think about them as "general storage" which adapts to the client but they are "centerpiece" of managing identities, secrets, passwords etc. by the organisations. And often follow a rigorous structure of secrets stored there, that is decided on an managed on "organisational" level. For example it is important, I believe, that if there is a service that can be acesses by many clients, the "secret" (password/token) is kept in the secret manager only once - and it's the clients should adapt to where and how it should be retrieved from. It simplifies rotation, security, access audits, security crisis responses etc. The big problem with the current approach of Airflow is that Airflow expects the secret to be in a format that only Airflow understands. And it's the same - whether it's URI or dictionary in your proposal. No other system that needs the same password for the same external service will be able to use it (unles they specifically implement support for "airflow-formatted-secret"). This basically means that if the same token needs to be used by another (non-Airlfow) client, the organisation will have to rotate it in two places - once in the format that is required by Airflow, and once in the format that is understood by the other client. The AWS implementaiton is I think the manifestation of that issue - the reason why we have configurable fields is because we wanted to tap into "non airflow formats" of storing the passwords/token. Where I think the only "universal" assumption we can make is that "password" or "token" is stored as "single, standalone secret". Not as part of URI, not as part of dictionary of specific structure. This is one of the reasons why for example LDAP clients are so configurable - because the clients have to be able to adapt to the organisational schema chosen by the organisation, not the other way round. AWS implementation (with full_url_mode = false) now is pretty much what LDAP clients do. Looking at it from the organisation that has to maintain many systems, I think it's very "selfish" (and self-centered) from Airflow side to expect the organisation to store the "token/password" in the format that only Airflow understands. > i think i'm a bit skeptical that we should get more involved in secrets / connections management / rotation. i think it might be best to leave it to the external tools. Absolutely agree. And I am not about managing the connection. The proposal I made with `SECRET:' is probably not good (for the reason you explained below. It was really one of the ways how to retrieve the password as single "secret" and plug-in the existing Airflow approach. But the AWS approach is indeed better (though more complex implementation and more difficult to generalize to all secret backends). Maybe we can find a solution that will: * apply to all secrets (like your dictionary proposal) * will have the configurabiliy that AWS secret backend has * does not force the organisation to keep "credentials" as part of the structure that is Airflow-specific > and pulling _part_ of a connection from secrets backend and _another_ part of it from _another_ secrets backend... that could get confusing pretty quickly, particularly when considering the diversity in the structure of connections in airflow. e.g. it's not just `password` that might be "secret" and in need of rotation.. Agree. My idea is quite bad from that point of view. Though it is generic enough - it is not limited to password only - the "SECRET://" pattern could be used in any of the fields of the connection (including extras). > separately, what do you think of the general idea of this PR though? should i proceed with it you think? i think i would also like to add abilitity to load from cli using json instead of URI (i.e. putting json on same level as URI) and, ultimately i think it's best to deprecate airflow URI but that could be more controversial. I think it depends on whether you (and others) would follow my proposed line of thinking. As explained above - it still requires from the organisation to keep the structure that only Airflow understands. I think this direction is not good in general, I think we should implement a way to retrieve secrets similarly to what AWS does rather than introduce yet-another format of storing secrets, but If I am the only one who thinks this is wrong and won't convince others, then yeah - it's certainly better than URI, so I am not going to block it. > already we have some secrets backends supporting json values, and we could continue to add support on a piecemeal basis, but i figure we should just make it first class citizen and standardize The "standardize" part is precisely what worries me as we are trying to impose "airflow-standard" on something that I feel we should not (i.e. centralized secret storage). But - I might be as well wrong. Maybe it is common practice to store secrets in proprietary formats. Maybe it's simply worth to check it with a number of (big corporate) users who have their experiences with that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
