potiuk commented on PR #59643: URL: https://github.com/apache/airflow/pull/59643#issuecomment-3687632991
Actually - the best way to fix it is to do it properly - i.e. run "test_connection" from workers - not from the "api-server" component. This is not the question of only the connection password masking but also execution environment. ## A bit more context. Running "test_connection" in the api_server context has several problems: 1) the api-server does not need to connect to outside world. In many cases where security is important, the api_server will be running in the container / network /security perimeter which cannot even make such outside connections to external systems becasue it will be behind restrictive firewall. Api-server is usually **somewhat** exposed to outside - ie. it's theorethicallly easier for the attacker to break into api-server, so a number of of deployments out-there have api-server in quite well separated environment. This is a functional problem with "test connection" run from api-server 2) No User-provided code (not even DAG-authored code) should be runnnable on api-server. Running test-connection is inherently unsafe - because there are a number of connections and drivers that are impossible to secure. This is the case with ODBC for example or JDBC - where you can often pass a configuration to such driver that might cause RCE in the environment where the driver is initialized - we had MANY such security issues in the past and this was the main reason why we disabled the test-connection by default https://airflow.apache.org/docs/apache-airflow/stable/release_notes.html#disable-default-allowing-the-testing-of-connections-in-ui-api-and-cli-32052 in Airflow 2.7.0 and warned our users that they should not enable it if they care about security. This was advised to our users in this CVE https://nvd.nist.gov/vuln/detail/cve-2023-37379 - and in the configuration we told them in the blog as well: https://airflow.apache.org/blog/airflow-2.7.0/ > Airflow 2.7.0 is a release that focuses on security. The Airflow security team, working together with security researchers, identified a number of areas that required strengthening of security. This resulted in, among others things, an improved description of the [Airflow security model](https://airflow.apache.org/docs/apache-airflow/stable/security/security_model/), a better explanation of our [security policy](https://github.com/apache/airflow/security/policy) and the disabling of certain, potentially dangerous, features by default - like, for example, connection testing (#32052). This is a security problem with running "Test-connection" from the api-server. The conditions of triggering the vulnerability and dangers have not changed a bit since 2.7.0 - even with task isolation - those are exactly the same dangers that were in 2.7.0 when test-connection was disabled by default. Users are exactly in the same - dangerous - place where they enable it back today. ## What can we do about with it ? Build on generic workflow feature introduced in Deadline alerts and instead of running "Test-connection" in the api-server, schedule workflow to be executed to actually do this in "worker" or "triggerer". This is what I was sceptical about when test-connection was introduced: https://github.com/apache/airflow/pull/15795#issuecomment-839665975 and further discussions. What HAS changed since then is the argument of runing "api-server" and "workers" together does not stand any more (soon). We do everything possible now to enable the case where workers have not only different dependencies but simply - different airflow distribiutions ("airflow-task-sdk" rather than "airflow-core") - so the main argument in that discusison that running Airflow in such isolated environments is an edge case - but in Airflow 3.2 hopefully - this will become essentially "reocmmended" way of running airflow. Also we have something more thna we had 4 year ago - we finally are clearing the path of being ab le to run more generic workloads and run them through wokers/triggerer and provide feedback from those workloads (with deadline feature) - and we could follow the suite and implement it in similar way. This does not solve all the security issues but it makes our isolation way more consistent. With test connection on api_server you basically make UI users capable of running arbitrary code in api_server, and arbitrary modification of the database. If you move it's execution to workers- those components of airflow are way more isolated and are supposed to run dangerous code already, and usually is done in ephemeral (say k8s pods in k8s executor) - rather than long running environment of api_server. Which has IMHO acceptable security properties to even enable it by default with some security model description changes. My recommendation would be - not to "fix" this feature - but declare it "broken" and fix it properly by moving test-connection workflows to "workers". -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
