yehoshuadimarsky commented on issue #20408:
URL: https://github.com/apache/airflow/issues/20408#issuecomment-998437846


   > > I see. The docs are confusing in that regard then, I thought the 
`remote_log_conn_id` is used for both reading and writing - for all operations 
for remote logging.
   > 
   > I heartily invite you to make a PR that will make it clearer there. 
Airflow is created by > 1800 contributors - most of them contributing code and 
docs voluntarily and this is an easy way to become one of the contributors and 
"give back" for the software you get for free. This is super easy - you just 
use "Suggest a change on this page" button at the bottom-right of the doc page 
and you will be able to submit changes to the documentation using 
GitHub UI as a new PR. It's even easier to make a PR than to create an issue 
about it in fact. And you as the user are one of the best to know what and how 
to describe it in the way that other users will find useful.
   
   Thank you! I will definitely try to submit a PR to change the docs. Thanks 
for making it easy and inviting.
   > 
   > > So maybe I can broaden my question then - why are there 2 different 
connection IDs, `remote_log_conn_id` and `google_key_path`? Maybe the GCP 
provider package should change the remote logging to use `remote_log_conn_id` 
for both read and write?
   > 
   > I believe (I have not implemented it) this is because of security. The 
conn_id is there and (currently) available to anyone who has access to the DB 
(including the webserver to read the logs). However writing should be 
exclusively available to the tasks "workload". Write access to the bucket is 
generally more "precious" and for example workload identity and using metadata 
server in GCP gives much better security, because the access to write is only 
granted to that particular machine executing the workload and only for the time 
of executing the task. Even if such "authentication" leaks, it is useless 
outside of this single task execution. On the other hand the "conn_id" details 
are static and leaking it could give anyone unlimited and uncontrolled access 
to write any data to the bucket. Which could lead to a number of security 
issues (such as log injection vulnerabilities etc).
   
   This makes sense, thank you.
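
For anyone following along, here is a minimal sketch of where the two settings discussed above sit in the `[logging]` section of `airflow.cfg`. The bucket name, connection ID, and key path are hypothetical placeholders, and the snippet only parses the fragment with Python's standard `configparser`. It does not show how Airflow itself consumes these options, which may differ between provider versions.

```python
import configparser

# Hypothetical airflow.cfg fragment. The bucket, connection ID, and key path
# are placeholders for illustration, not values from this issue.
AIRFLOW_CFG = """
[logging]
remote_logging = True
remote_base_log_folder = gs://example-bucket/airflow/logs
remote_log_conn_id = example_gcp_conn
google_key_path = /path/to/example-key.json
"""

parser = configparser.ConfigParser()
parser.read_string(AIRFLOW_CFG)
logging_cfg = parser["logging"]

# The two separate credential settings the question is about.
remote_log_conn_id = logging_cfg.get("remote_log_conn_id")
google_key_path = logging_cfg.get("google_key_path")

print("remote_log_conn_id:", remote_log_conn_id)
print("google_key_path:", google_key_path)
```

Seeing them side by side makes the distinction easier to talk about: one is a reference to a connection stored in the metadata DB, the other is a path to a credential file on the machine itself.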


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


