potiuk commented on issue #26861:
URL: https://github.com/apache/airflow/issues/26861#issuecomment-1272320589

   I am inclining to close it as "not a bug" but would love to hear also what 
others say.
   
   While this change is indeed breaking old behaviour, one might say it is 
bringing back consistency to the behaviour of the API with other API's and 
fixing an incidental behaviour that the API shown before. 
   
   Nowhere it is mentioned, needed or even expected that the API is "single 
source of truth" - in this case there is not even single "truth" that we can 
talk about. It all depends on definition the `get_variable` function has.
   
   If we define "variable" APIs as exclusively showing and managing variables 
that are defined via DB then:
   
   a) it is consistent wihth list, update, create delete methods - it is 
impossible to modify in any way the variables defined via env variables or even 
secrets, so naturally, those methods do not work. Similarly it is just 
"standard" airflow behaviour that listing a variables defined by env variables 
or secrets is impossible. You need to know they are there. 
   
   b) if the variable is defined in both DB and env, then this is impossible to 
read DB variable which you just set with PATCH. That turns effectively the DB 
variable in "write-only"  - you can create, update, delete it. But you can't 
read it. That was very inconsistent behaviour that #25370
   
   c) the new behaviour is consistent with UI/Web behaviour - you cannot 
list/read env variables via web.
   
   d) if we leave the env variable reading, we open up for behaviour where it 
might be wrong anyway. The API endpoint runs on the webserver, so if you set 
variable via ENV variables only on workers and scheduler, but not on the 
webserver, the API will not return the right "ENV" variable value anyway. And 
TBH - the env variable for variables set on webserver are useless because none 
of the webserver code uses variables (except from manipulating the DB 
variables) - the only way you can actually use variables is in workers (and if 
you are a bit sloppy and do not follow our best practices - in DagFileProcessor 
- but this should be discourage due to performance reasons).
   
   That means that it cannot even be considered as "source of truth" because 
the "truth" will depend on whether the ENV is set consistently everywhere. It 
is not required either to have same ENV variable everywhere. You could even 
imagine that you have different types of workers and each worker has a 
different ENV variable set. For example you could use variable to get the right 
URL you should use for this particular worker to communicate - and the URL 
could be different for different workers. This is a very viable use of 
ENV-defined variables and nowhere in Airlfow we prevent or discourage doing it. 
It will "just work". In this case it's even IMPOSSIBLE to get the source of 
truth because the value of the variable from env will be different on each 
worker and `get_variable` would be completely meaningless in this context.
   
   Taking into account all of the above I have strong opinion that we should 
close it. I am going for holidays now, but if there are other committers who 
think similarly, and agree with me, then I think we should close that one.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to