xyu opened a new issue #20092:
URL: https://github.com/apache/airflow/issues/20092


   ### Apache Airflow version
   
   2.2.1
   
   ### What happened
   
   #18772 added two options to `tmp_configuration_copy()`, and when tasks run as the same user as the worker, the temp configuration file for the task runner is now generated with the options `include_env=False` and `include_cmds=False`. This is a change from the previous defaults, which materialized the values of `*_cmd` configs into the JSON dump.
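
   For reference, here is a minimal sketch of roughly what the dump generation looks like after that change, assuming it delegates to `conf.as_dict()` (a simplification; the real function also handles file permissions):

   ```
   # Hedged sketch of tmp_configuration_copy() after #18772 -- an
   # approximation, not the actual implementation.
   import json
   import os
   from tempfile import mkstemp

   from airflow.configuration import conf

   def tmp_configuration_copy_sketch(include_env=True, include_cmds=True):
       # The same-user path now calls this with include_env=False and
       # include_cmds=False. With include_cmds=False, *_cmd options are
       # dumped verbatim as command strings instead of being executed and
       # materialized into their bare counterparts.
       cfg_dict = conf.as_dict(
           display_sensitive=True,
           raw=True,
           include_env=include_env,
           include_cmds=include_cmds,
       )
       fd, path = mkstemp()
       with os.fdopen(fd, "w") as f:
           json.dump(cfg_dict, f)
       return path
   ```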
   
   This presents a problem when using `sql_alchemy_conn_cmd` to set the database connection and running tasks as the same user: the temporary config JSON dump now includes both `sql_alchemy_conn`, set to the Airflow distribution default, and the user-set `sql_alchemy_conn_cmd`. Because bare settings take precedence over their `_cmd` versions, the Airflow worker can connect to the configured DB, but the task runner itself uses a non-existent SQLite DB, causing all tasks to fail.
   
   TL;DR: the temp JSON config dump used to look something like:
   
   ```
   {
     "core": {
       "sql_alchemy_conn": "mysql://...",
       ...
     }
   }
   ```
   
   Now it looks something like:
   
   ```
   {
     "core": {
       "sql_alchemy_conn": "sqlite:////var/opt/airflow/airflow.db",
       "sql_alchemy_conn_cmd": "/etc/airflow/get-config 'core/sql_alchemy_conn'",
       ...
     }
   }
   ```
   
   But because `sql_alchemy_conn` is set, `sql_alchemy_conn_cmd` never gets called, and tasks are unable to access the Airflow DB.
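
   To illustrate the precedence rule, here is a minimal sketch of the lookup order (a simplification of `AirflowConfigParser.get()`, not the actual implementation):

   ```
   # Simplified precedence sketch: a bare value always wins over its _cmd
   # variant, even when the bare value is only the shipped SQLite default.
   import subprocess

   def get_option(cfg, section, key):
       options = cfg.get(section, {})
       if key in options:
           # Bare setting present (e.g. the default sql_alchemy_conn that
           # tmp_configuration_copy() now dumps) -- the _cmd is never run.
           return options[key]
       if f"{key}_cmd" in options:
           # Only consulted when the bare key is absent.
           return subprocess.check_output(
               options[f"{key}_cmd"], shell=True, text=True
           ).strip()
       return None
   ```

   With the new dump, `cfg["core"]` contains both keys, so the first branch returns the SQLite default and the command is never executed.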
   
   ### What you expected to happen
   
   I'm not quite sure what the preferred method to fix this issue is. I guess there are a couple of options:
   - Remove the bare `sensitive_config_values` entries from the dump if either a `_cmd` or `_secret` version exists (sketched below)
   - Ignore the Airflow default value, in addition to empty values, for bare `sensitive_config_values` during parsing
   - Go back to materializing the sensitive configs
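
   For the first option, a rough sketch of what the pruning could look like (a hypothetical helper, assuming `sensitive_config_values` is the set of `(section, key)` pairs it is today):

   ```
   # Hypothetical pruning helper for option 1 -- not an actual Airflow patch.
   # Drop the bare sensitive key from the dump whenever a _cmd or _secret
   # variant is present, so the variant is resolved at read time instead.
   def prune_bare_sensitive(cfg_dict, sensitive_config_values):
       for section, key in sensitive_config_values:
           options = cfg_dict.get(section, {})
           if f"{key}_cmd" in options or f"{key}_secret" in options:
               options.pop(key, None)
       return cfg_dict
   ```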
   
   ### How to reproduce
   
   With either Airflow 2.2.1 or 2.2.2, configure the DB with `sql_alchemy_conn_cmd` and remove the `sql_alchemy_conn` config from your config file. Then run the Airflow worker and task runner as the same user. Try running any task and see that it tries to access the default SQLite store instead of the configured one.
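
   A quick probe from the worker's environment (a sketch; it assumes `[core]` only carries `sql_alchemy_conn_cmd` and no bare `sql_alchemy_conn`):

   ```
   # Repro probe -- run where the worker's airflow.cfg is in effect.
   import json

   from airflow.configuration import conf
   from airflow.utils.configuration import tmp_configuration_copy

   # In-process lookup resolves the _cmd and prints the configured MySQL URI.
   print(conf.get("core", "sql_alchemy_conn"))

   # The temp copy handed to the task runner instead carries the SQLite
   # default alongside the _cmd, so the task will use the default.
   path = tmp_configuration_copy(include_env=False, include_cmds=False)
   with open(path) as f:
       print(json.load(f)["core"]["sql_alchemy_conn"])
   ```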
   
   ### Operating System
   
   Debian Bullseye
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

