potiuk commented on PR #59643:
URL: https://github.com/apache/airflow/pull/59643#issuecomment-3687632991

   Actually - the best way to fix it is to do it properly - i.e. run 
"test_connection" from workers  - not from the "api-server" component.  This is 
not the question of only the connection password masking but also execution 
environment. 
   
   ## A bit more context.
   
   Running "test_connection" in the api_server context has several problems:
   
   1) the api-server does not need to connect to outside world. In many cases 
where  security is important, the api_server will be running in the container / 
network /security perimeter which cannot even make such outside connections to 
external systems becasue it will be behind restrictive firewall. Api-server is 
usually **somewhat** exposed to outside - ie. it's theorethicallly easier for 
the attacker to break into api-server, so a number of of deployments out-there 
have api-server in quite well separated environment. 
   
   This is a functional problem with "test connection" run from api-server
   
   2) No User-provided code (not even DAG-authored code) should be runnnable on 
api-server. Running test-connection is inherently unsafe - because there are a 
number of connections and drivers that are impossible to secure. This is the 
case with ODBC for example or JDBC - where you can often pass a configuration 
to such driver that might cause RCE in the environment where the driver is 
initialized - we had MANY such security issues in the past and this was the 
main reason why we disabled the test-connection by default 
https://airflow.apache.org/docs/apache-airflow/stable/release_notes.html#disable-default-allowing-the-testing-of-connections-in-ui-api-and-cli-32052
  in Airflow 2.7.0 and warned our users that they should not enable it if they 
care about security. This was advised to our users in this CVE 
https://nvd.nist.gov/vuln/detail/cve-2023-37379 - and in the configuration we 
told them in the blog as well:
   
   https://airflow.apache.org/blog/airflow-2.7.0/
   
   > Airflow 2.7.0 is a release that focuses on security. The Airflow security 
team, working together with security researchers, identified a number of areas 
that required strengthening of security. This resulted in, among others things, 
an improved description of the [Airflow security 
model](https://airflow.apache.org/docs/apache-airflow/stable/security/security_model/),
 a better explanation of our [security 
policy](https://github.com/apache/airflow/security/policy) and the disabling of 
certain, potentially dangerous, features by default - like, for example, 
connection testing (#32052).
   
   This is a security problem with running "Test-connection" from the 
api-server.
   
   The conditions of triggering the vulnerability and dangers have not changed 
a bit since 2.7.0 - even with task isolation - those are exactly the same 
dangers that were in 2.7.0 when test-connection was disabled by default. Users 
are exactly in the same - dangerous - place where they enable it back today.
   
   ## What can we do about  with it ? 
   
   Build on generic workflow feature introduced in Deadline alerts and instead 
of running "Test-connection" in the api-server, schedule workflow to be 
executed to actually do this in "worker" or "triggerer". This is what I was 
sceptical about when test-connection was introduced:  
https://github.com/apache/airflow/pull/15795#issuecomment-839665975  and 
further discussions. What HAS changed since then is the argument of runing 
"api-server" and "workers" together does not stand any more (soon). We do 
everything possible now to enable the case where workers have not only 
different dependencies but simply - different airflow distribiutions 
("airflow-task-sdk" rather than "airflow-core") - so the main argument in that 
discusison that running Airflow in such isolated environments is an edge case - 
but in Airflow 3.2 hopefully - this will become essentially "reocmmended" way 
of running airflow. Also we have something more thna we had 4 year ago -  we 
finally are clearing the path of being ab
 le to run more generic workloads and run them through wokers/triggerer and 
provide feedback from those workloads (with deadline feature) - and we could 
follow the suite and implement it in similar way.
   
   This does not solve all the security issues but it makes our isolation way 
more consistent. With test connection on api_server you basically make UI users 
capable of running arbitrary code in api_server, and arbitrary modification of 
the database. If you move it's execution to workers- those components of 
airflow are way more isolated and are supposed to run dangerous code already, 
and usually is done in ephemeral (say k8s pods in k8s executor) - rather than 
long running environment of api_server. Which has IMHO acceptable security 
properties to even enable it by default with some security model description 
changes.
   
   My recommendation would be - not to "fix" this feature - but declare it 
"broken" and fix it properly by moving test-connection workflows to "workers". 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to