kaxil commented on pull request #19637:
URL: https://github.com/apache/airflow/pull/19637#issuecomment-972130383


   > Hmm. Maybe I am a little overcautious here, but it comes from the 
experience of responding and trying to help many of our users. I think it 
really boils down to what we want to optimize for.
   > 
   > My current optimisation "goal" is to decrease the number of problems our 
users even approach us with. So if there is a problem we can anticipate and 
solve for our users without them even knowing it, I am all for that. Adding 
more and more escape hatches like that is going quite against that 
"optimisation" goal, because we admit that there is a problem and we try to 
"throw the problems at our users to solve" - and this (from the experience) 
will hit us back.
   > 
   > This was the case with the warning for "removed" task instance entries. I 
know why we added it but IMHO it exposed an internal detail of airlfow to our 
users, and this kinda 'undermines' the feeling that airlfow is "reliable" and 
"safe to upgrade" - users who just want to install airflow without knowing how 
it works intenrnally should be first-class citizens as well, and those kind of 
"escape hatches" just leaks the "airlfow" abstraction. I think sometimes we 
tend to forget that our Ops (even admins) are not "developers" or even no "sql 
admins" should not eventually even know that there is a database behind. This 
is really an implementation detail for many.
   > 
   > But If I am the only one to be worried, I am ok with the flag (I just hope 
in the next few months we will not have to handle a floo od of issues upgrading 
from say 2.0.2).
   
   I am "-1" for the same reasons that I don't want our users to be to have to 
"unnecessarily" wait and instead of actually making sure that we don't break 
serialization we are cleaning the serialized_dags. And this comes from years of 
experience and helping users and companies as well.
   
   >Adding more and more escape hatches like that is going quite against that 
"optimisation" goal, because we admit that there is a problem and we try to 
"throw the problems at our users to solve" - and this (from the experience) 
will hit us back.
   
   That's exactly why we should not just clean the table and can still achieve 
what we need without comprising the cost of reserializing the table. And it is 
not "throw the problems at our users to solve" , it is if we goof up for 
whatever reason we have a "backup" that helps users without compromising it for 
the versions that don't have it. It is a different way of looking at the coin !!
   
   > This was the case with the warning for "removed" task instance entries. I 
know why we added it but IMHO it exposed an internal detail of airlfow to our 
users, and this kinda 'undermines' the feeling that airlfow is "reliable" and 
"safe to upgrade"
   
   That is the reason we adopted the solution I suggested to move things to a 
temporary table then ask them to cleanup dangling TIs where they didn't even 
know what they need to run. And it is not "one size fits all" problem -- there 
are users who don't care about those but there are corporate, banks and others 
who have policies to retain data for a certain period, so they can decide if 
they want that data or not.
   
   >and this kinda 'undermines' the feeling that airlfow is "reliable" and 
"safe to upgrade" - users who just want to install airflow without knowing how 
it works intenrnally should be first-class citizens as well, and those kind of 
"escape hatches" just leaks the "airlfow" abstraction
   
   Again, there are two kinds of users and it is not one size fits all. For 
"new users" / "new installs" neither is the problem -- nor the TI issue or the 
dag serialization issue. It is for the users during the "upgrade" -- the main 
benefit of open source is transparency (and this is my view). To debug why a 
DAG is not working, you _might_ need to look at parsing logs and the serialized 
representation of the DAG -- so it is not leak of abstration but understanding 
of a core process.
   
   >I think sometimes we tend to forget that our Ops (even admins) are not 
"developers" or even no "sql admins" should not eventually even know that there 
is a database behind. This is really an implementation detail for many.
   
   And we should not forget that we in trying to optimize for new users we 
don't forget the users and companies running Airflow for years who want to 
optimize it and they need tools from us. I would love to make it as easy to 
upgrade but without compromising, the solution you mentions only achieves one 
thing which is why I am trying to find the best of both worlds as we have both 
type of users.
   
   >But If I am the only one to be worried, I am ok with the flag
   
   I will be very happy to hear other solutions that fixes it for both type of 
users and I have no strong opinion on the flag, I have proposed a separate 
sub-command solution to similar to https://github.com/apache/airflow/pull/19471 
. 
   
   > (I just hope in the next few months we will not have to handle a flood of 
issues upgrading from say 2.0.2).
   
   Do you think of all the issues that we are flooded with, this one issue will 
have more?
   
   I understand what you want to solve and why and I do appreciate it but I 
think we should not take this shortcut. However this is my personal opinion 
only :) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to